1 Introduction

Poisson approximation to many discrete distributions (notably the Poisson-binomial distribution) has
received extensive attention in the literature and many different approaches have been proposed. The main
problem is to study the closeness between the discrete distribution in question and a suitably chosen
Poisson distribution. Applications in diverse problems have also stimulated much of its recent interest among
probabilists and scientists in applied disciplines. We propose in this paper a new, self-contained approach
to Poisson approximation, which leads readily to many new effective bounds for several distances studied
before, including total variation, Kolmogorov, Wasserstein, Kullback-Leibler, point metric, and $\chi^2$; see
below for more information and references. In addition to the application to these distances, we also attempt
to survey most of the quantitative results we collected for the Poisson approximation distances discussed
in this paper.

1.1 A historical account with brief review of results 

We start with a brief historical account of Poisson approximation, focusing in particular on the evolution of
the total variation distance; a more detailed, technical discussion will be given in Section 6. For other
surveys, see [38, 9, 4, 22, 72].

The early history of Poisson approximation. The Poisson distribution appeared naturally as the limit of
the distribution of the sum of a large number of independent trials each with very small probability of
success. Such a limit form, being the most primitive version of Poisson approximation, dates back to at least
de Moivre's work [32] in the early eighteenth century and Poisson's book [61] in the nineteenth century.
Haight [38] writes: ". . . although Poisson (or de Moivre) discovered the mathematical expression (1.1-1)
[which is $e^{-\lambda}\lambda^k/k!$], Bortkiewicz discovered the probability distribution (1.1-1)." And
according to Good [37], "perhaps the Poisson distribution should have been named after von Bortkiewicz (1898)
because he was the first to write extensively about rare events whereas Poisson added little to what de Moivre
had said on the matter and was probably aware of de Moivre's work;" see also Seneta's account in [74] on
Abbe's work. In addition to Bortkiewicz's book [17], another important contribution to the early history of
Poisson approximation was made by Charlier [21] for his type B expansion, which will play a crucial role in
our development of arguments.

The next half a century or so after Bortkiewicz and Charlier then witnessed an increase of interest in
the properties and applications of the Poisson distribution and Charlier's expansion. In particular, Jordan
[47] proved the orthogonality of the Charlier polynomials with respect to the Poisson measure, and
considered a formal expansion pair, expressing the Taylor coefficients of a given function in terms of series
of Charlier polynomials and vice versa. A sufficient condition justifying the validity of such an expansion
pair was later provided by Uspensky [83]; he also derived very precise estimates for the coefficients in
the case of the binomial distribution. His complex-analytic approach was later extended by Shorgin [80] to
the more general Poisson-binomial distribution (each trial with a different probability; see next paragraph).
Schmidt [73] then gave a necessary and sufficient condition for justifying the Charlier-Jordan expansion;
see also Boas [13] and the references therein. Prohorov [65] was the first to study, using elementary
arguments, the total variation distance between binomial and Poisson distributions, thus upgrading the
classical limit theorem to an approximation theorem.

From classical to modern. However, a large portion of the development of the modern theory of Poisson
approximation deviates significantly from the classical line, and much of its modern interest can be attributed
to the pioneering paper by Le Cam [54], extending the previous study by Prohorov [65] for the binomial
distribution. Le Cam considered in particular the sum $S_n$ of $n$ independent Bernoulli random variables with
parameters $p_1, p_2, \ldots, p_n$, respectively, and proved that the total variation distance

\[ d_{TV}(\mathscr{L}(S_n), \mathscr{P}(\lambda)) := \frac12 \sum_{m\ge0} \left| \mathbb{P}(S_n = m) - e^{-\lambda}\frac{\lambda^m}{m!} \right| \]

between the distribution of $S_n$ (often referred to as the Poisson-binomial distribution) and that of a Poisson
with mean $\lambda := \sum_{1\le j\le n} p_j$ is bounded above by

\[ d_{TV}(\mathscr{L}(S_n), \mathscr{P}(\lambda)) \le 8\theta \]

whenever $p_* := \max_j p_j \le 1/4$, where $\theta := \lambda_2/\lambda$, $\lambda_2 := \sum_{1\le j\le n} p_j^2$.
He also proved in the same paper the following inequality, now often referred to under his name,

\[ d_{TV}(\mathscr{L}(S_n), \mathscr{P}(\lambda)) \le \lambda_2. \tag{1.1} \]

These results were later further improved in the literature and the approach he used became the source
of developments of more advanced tools; see Table 1 for a selected list of known results of the simplest
form $d_{TV} \le c\theta$.

Author(s)              Year   $d_{TV} \le$    Assumption       Approach
Le Cam                 1960   $8\theta$       $p_* \le 1/4$    Operator and Fourier
Kerstan                1964   $1.05\theta$    $p_* \le 1/4$    Operator and Fourier
Chen                   1974   $5\theta$                        Chen-Stein
Barbour and Hall       1984   $\theta$                         Chen-Stein
Presman                1985   $2.08\theta$                     Fourier
Daley and Vere-Jones   1988   $0.71\theta$    $p_* \le 1/4$    Fourier

Table 1: Some results of the form $d_{TV} := d_{TV}(\mathscr{L}(S_n), \mathscr{P}(\lambda)) \le c\theta$. Here $\theta := \lambda_2/\lambda$ and $p_* := \max_j p_j$.

It is known that $d_{TV}(\mathscr{L}(S_n), \mathscr{P}(\lambda)) \sim \theta/\sqrt{2\pi e}$ when $\theta \to 0$; see Deheuvels and Pfeifer [30] or Hwang
[43]. Numerically, $1/\sqrt{2\pi e} \approx 0.242$.
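The constants in Table 1 can be compared against an exact computation. The following is a minimal sketch, not part of the original paper: it computes the total variation distance between a Poisson-binomial law and the Poisson law with the same mean by direct convolution. The function names and the example parameters (200 trials with success probability 0.01) are hypothetical choices made only for illustration.

```python
import math

def poisson_binomial_pmf(ps):
    """Exact pmf of S_n (sum of independent Bernoulli(p_j)) by repeated convolution."""
    pmf = [1.0]
    for p in ps:
        nxt = [0.0] * (len(pmf) + 1)
        for m, w in enumerate(pmf):
            nxt[m] += w * (1.0 - p)   # contribution of X_j = 0
            nxt[m + 1] += w * p       # contribution of X_j = 1
        pmf = nxt
    return pmf

def tv_to_poisson(ps):
    """d_TV between the law of S_n and Poisson(lambda), with lambda = sum(ps)."""
    lam = sum(ps)
    pb = poisson_binomial_pmf(ps)
    total, po, mass = 0.0, math.exp(-lam), 0.0
    for m, a in enumerate(pb):
        total += abs(a - po)
        mass += po
        po *= lam / (m + 1)           # Poisson recurrence: pi_{m+1} = pi_m * lam / (m+1)
    total += max(0.0, 1.0 - mass)     # Poisson mass on m > n, where P(S_n = m) = 0
    return 0.5 * total

ps = [0.01] * 200                     # hypothetical example: Binomial(200, 0.01)
lam = sum(ps)
theta = sum(p * p for p in ps) / lam
d_tv = tv_to_poisson(ps)
print(f"d_TV = {d_tv:.6f}, theta = {theta:.6f}, "
      f"theta/sqrt(2*pi*e) = {theta / math.sqrt(2 * math.pi * math.e):.6f}")
```

For such an example the computed distance should fall below the Barbour and Hall bound $\theta$ and be of the same order as $\theta/\sqrt{2\pi e}$; the asymptotic equivalence itself is a limiting statement and is not expected to hold exactly at finite parameters.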

From Table 1, we should point out that the leading constant in the first-order estimate for $d_{TV}$ is often
less important than the generality of the approach used, although the pursuit of the optimal leading constant is
of independent interest per se. One reason is that if an approach can be readily amended to obtain
higher-order estimates, then one can push the calculations further by obtaining more terms in the asymptotic
expansions with smaller and smaller errors, so that the implied constants in the error terms matter less (the
derivation of which often involves detailed calculus).

On the other hand, estimates for the total variation distance between the distribution of $S_n$ and a suitably
chosen Poisson distribution have been the subject of many papers in the last five decades. Other forms in the
literature include $d_{TV} \le \varphi(\theta)$, $d_{TV} \le \varphi(\theta, \max_j p_j)$, $d_{TV} \le \varphi(\theta, \lambda)$, . . . , for certain functionals $\varphi$ ($\varphi$ not
the same for each occurrence). Thus it is often difficult to compare these results; further complications arise
because some metrics are related to others by simple inequalities and the results for one can be transferred
to the others; also the complexity of the diverse methods of proof is not easily compared. Despite these,
we quickly review those that are pertinent to ours; a more detailed, technical comparative discussion for
some of these will be given later. The special case of the binomial distribution will however not be compared
separately; see, for example, Prohorov [65], Vervaat [84], Romanowska [67], Matsunawa [56], Pfeifer
[59], Kennedy and Quine [48], Poor [63].

Kerstan [49] refined some results of Le Cam [54] on $d_{TV}$ by a similar approach. He also derived
a second-order estimate. Herrmann [39] further extended results in Kerstan [49] in two directions: to
sums of random variables each assuming finitely many integer values and, in addition to higher-order
estimates from the Charlier expansion, to signed measures whose generating functions are of the forms
$\exp\bigl(\sum_{1\le j\le k} \frac{(-1)^{j-1}}{j}\lambda_j (z-1)^j\bigr)$, with $\lambda_j := \sum_{1\le i\le n} p_i^j$. We will comment on Kerstan's and Herrmann's
second-order estimates later. As far as we are aware, Herrmann [39] was the first to use such signed measures
for Poisson approximation problems, although such approximations were later referred to as Kornya-Presman or
Kornya-type approximations, the two references being Kornya [52] and Presman [64]. Note that the idea
of using other signed measures (binomial) was already discussed in Le Cam [54]. Serfling [75] extended
Le Cam's inequality (1.1) to dependent cases; see also [76]. Chen [23] proposed a new approach to Poisson
approximation, based on Stein's method of normal approximation (see Stein [78]).

From 1980 on, most of the approaches proposed previously for Poisson approximation problems
received much more attention and were further developed and refined. Among these, the Chen-Stein method
(with or without couplings) is undoubtedly the most widely used and the most fruitful one. It is readily
amended for dealing with dependent situations, but usually leads to less precise bounds for numerical
purposes. On the other hand, direct or indirect classical Fourier analysis, although involving fewer
probabilistic ingredients and relying on more explicit forms of generating functions, often gives better numerical bounds.
For these and other approaches (including semigroup with Fourier analysis, information-theoretic), see
Deheuvels and Pfeifer [28], Stein [78], Aldous (1989), Barbour et al. [9], Steele [81], Janson [46], Roos
[69, 70], Kontoyiannis et al. [51] and the references therein.



1.2 Our new approach 



The new approach we are developing in this paper starts from the integral representation for a given
sequence $\{A_n\}_{n\ge0}$ (satisfying certain conditions specified in the next section)

\[ \sum_{n\ge0} \left| \frac{A_n}{e^{-\lambda}\lambda^n/n!} \right|^2 e^{-\lambda}\frac{\lambda^n}{n!} = \int_0^\infty I\bigl(\sqrt{r/\lambda}\bigr) e^{-r}\,dr, \tag{1.2} \]

where $\lambda > 0$ and

\[ I(r) := \frac{1}{2\pi} \int_{-\pi}^{\pi} \Bigl| e^{-\lambda r e^{it}} \sum_{j\ge0} A_j \bigl(1 + re^{it}\bigr)^j \Bigr|^2 dt. \]

Note that $I(r) = \sum_{n\ge0} |a_n|^2 r^{2n}$, where $a_n$ denotes the coefficient of $z^n$ in the Taylor expansion of
$e^{-\lambda z}\sum_{j\ge0} A_j (1+z)^j$. This means that (1.2) can be written in the form

\[ \sum_{n\ge0} \frac{|A_n|^2}{e^{-\lambda}\lambda^n/n!} = \sum_{n\ge0} \frac{n!}{\lambda^n}\,|a_n|^2, \]

which, as far as we are aware, already appeared in the paper Pollaczek-Geiringer [62], but no further use
of it has been discussed; see also Jacob [45], Schmidt [73], Siegmund-Schultze [77] and the references
cited there. Also the series on the right-hand side is in almost all cases we are considering less useful than
the integral in (1.2).

The seemingly strange and complicated starting point (1.2) turns out to be very useful for
developing effective tools for most Poisson approximation problems. Other ingredients required are surprisingly
simple, with very little use of complex analysis. A typical result is of the form

\[ d_{TV}(\mathscr{L}(S_n), \mathscr{P}(\lambda)) \le \frac{\sqrt{e}-1}{\sqrt{2}}\,\frac{\theta}{(1-\theta)^{3/2}}, \]

where $(\sqrt{e} - 1)/\sqrt{2} \approx 0.46$; see Theorem 3.4. The relation (1.2), which will be proved below, is based
on the orthogonality of Charlier polynomials and the Parseval identity; thus we call it the Charlier-Parseval
identity.

Other features of our approach are: first, it reduces the estimate of the probability distances to that 
of certain integral representations with a similar form to the right-hand side of (1.2), and thus being of 
certain Tauberian character; second, it can be readily extended to derive asymptotic expansions; third, the 
use of the correspondence between Charlier polynomials and Poisson distribution can be quickly amended 
for other families of orthogonal polynomials and their corresponding probability distributions; fourth, the 
same idea used applies equally well to the de-Poissonization procedure, and leads to some interesting new 
results, details being discussed elsewhere. 



Organization of the paper. This paper is organized as follows. We begin with the development of our
approach in the next section. Then, except for Section 6, which focuses on reviewing and comparing
with known results, the next three sections consist of applications of our Charlier-Parseval approach:
Section 3 to several distances of Poisson approximation to $S_n$ for large $\lambda$, Section 4 to second-order estimates,
and Section 5 to approximations by signed measures.

2 The new Charlier-Parseval approach

Crucial to the development of our approach is the use of Charlier polynomials, so we first derive a few
properties of Charlier polynomials we will need.



2.1 Definition and basic properties of Charlier polynomials 

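The Charlier polynomials $C_k(\lambda, n)$ are defined by

\[ \sum_{n\ge0} C_k(\lambda,n)\frac{\lambda^n}{n!} z^n = (z-1)^k e^{\lambda z} \qquad (k = 0, 1, \ldots). \tag{2.1} \]

Multiplying both sides by $z - 1$, we see that

\[ \frac{\lambda^{n-1}}{(n-1)!} C_k(\lambda,n-1) - \frac{\lambda^n}{n!} C_k(\lambda,n) = \frac{\lambda^n}{n!} C_{k+1}(\lambda,n), \tag{2.2} \]

which implies that the Charlier polynomials $\psi_k(n) := C_k(\lambda, n)$ are solutions to the system of difference
equations $x\psi_k(x-1) - \lambda\psi_k(x) = \lambda\psi_{k+1}(x)$, with the initial condition $\psi_0(x) = 1$. In particular,

\[ C_1(\lambda,n) = \frac{n-\lambda}{\lambda} \quad\text{and}\quad C_2(\lambda,n) = \frac{n^2 - (2\lambda+1)n + \lambda^2}{\lambda^2}. \tag{2.3} \]

An alternative expression for $C_k(\lambda, n)$ is given by

\[ \frac{\lambda^n}{n!} C_k(\lambda,n) = e^{\lambda}\,\frac{d^k}{d\lambda^k}\Bigl( e^{-\lambda}\frac{\lambda^n}{n!} \Bigr), \]

which follows from substituting the relation $(z-1)^k e^{\lambda z} = e^{\lambda}(d^k/d\lambda^k)\, e^{\lambda(z-1)}$ into (2.1).
Since by (2.1)

\[ C_k(\lambda,n)\frac{\lambda^n}{n!} = [z^n](z-1)^k e^{\lambda z}, \tag{2.4} \]

where $[z^n]\phi(z)$ denotes the coefficient of $z^n$ in the Taylor expansion of $\phi(z)$, we have, for each fixed $n$,

\[ \sum_{k\ge0} C_k(\lambda,n)\frac{\lambda^n}{n!}\frac{\lambda^k}{k!} w^k = [z^n]\, e^{\lambda z + \lambda w(z-1)} = \frac{\lambda^n}{n!}(1+w)^n e^{-\lambda w}. \]

It follows that

\[ \sum_{n\ge0} C_n(\lambda,k)\frac{\lambda^n}{n!} w^n = (1+w)^k e^{-\lambda w}. \]

Comparing this relation with (2.1), we obtain the property $C_k(\lambda, n) = (-1)^{n+k} C_n(\lambda, k)$, for all $k, n \ge 0$.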
Another important property we will need is the following orthogonality relation (see [79, p. 35]).

Lemma 2.1. The Charlier polynomials are orthogonal with respect to the Poisson measure $e^{-\lambda}\lambda^n/n!$,
namely,

\[ \sum_{n\ge0} C_k(\lambda,n) C_\ell(\lambda,n)\, e^{-\lambda}\frac{\lambda^n}{n!} = \delta_{k,\ell}\, k!\, \lambda^{-k}, \tag{2.5} \]

where $\delta_{a,b}$ denotes the Kronecker symbol.

For self-containedness and in view of the importance of this orthogonality relation to our analysis
below, we give here a proof similar to the original one by Jordan [47].

Proof. We start from the expansion

\[ C_k(\lambda,n) = \sum_{0\le j\le k} \binom{k}{j} (-1)^{k-j} \lambda^{-j}\, n(n-1)\cdots(n-j+1), \tag{2.6} \]

which follows directly from (2.4). Differentiating both sides of (2.1) $j$ times with respect to $z$ and
substituting $z = 1$, we get

\[ \sum_{n\ge0} e^{-\lambda}\frac{\lambda^n}{n!}\, C_k(\lambda,n)\, n(n-1)\cdots(n-j+1) = k!\,\delta_{j,k} \qquad (0 \le j \le k), \]

which means that the Charlier polynomials $C_k(\lambda, x)$ are orthogonal to any falling factorials of the form
$x(x-1)\cdots(x-j+1)$ with $j < k$ with respect to the Poisson measure. Now without loss of generality,
we may assume that $\ell \le k$. Then applying (2.6), we get

\[ \sum_{n\ge0} e^{-\lambda}\frac{\lambda^n}{n!} C_k(\lambda,n) C_\ell(\lambda,n) = \sum_{0\le j\le \ell} \binom{\ell}{j} (-1)^{\ell-j}\lambda^{-j} \sum_{n\ge0} e^{-\lambda}\frac{\lambda^n}{n!} C_k(\lambda,n)\, n(n-1)\cdots(n-j+1) \]
\[ = \sum_{0\le j\le \ell} \binom{\ell}{j} (-1)^{\ell-j}\lambda^{-j}\,\delta_{j,k}\,k! = \delta_{k,\ell}\,\frac{k!}{\lambda^k}. \]

This completes the proof. □
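The orthogonality relation of Lemma 2.1 is easy to check numerically. The sketch below is only an illustration under stated assumptions: it evaluates the Charlier polynomials through the binomial-type expansion (2.6) as reconstructed above, and truncates the Poisson sum at a hypothetical cutoff `n_max`; the function names are not from the paper.

```python
import math

def charlier(k_max, lam, n):
    """Values C_0(lam, n), ..., C_{k_max}(lam, n) from the explicit expansion (2.6)."""
    vals = []
    for k in range(k_max + 1):
        s, falling = 0.0, 1.0          # falling = n(n-1)...(n-j+1), empty product for j = 0
        for j in range(k + 1):
            s += math.comb(k, j) * (-1) ** (k - j) * falling / lam ** j
            falling *= (n - j)
        vals.append(s)
    return vals

def inner_product(k, l, lam, n_max=200):
    """Truncated sum of C_k * C_l against the Poisson(lam) weights."""
    total, w = 0.0, math.exp(-lam)     # w = e^{-lam} lam^n / n!
    for n in range(n_max + 1):
        c = charlier(max(k, l), lam, n)
        total += c[k] * c[l] * w
        w *= lam / (n + 1)
    return total

lam = 3.0
print(inner_product(2, 2, lam), math.factorial(2) / lam ** 2)   # should agree closely
print(inner_product(1, 3, lam))                                  # should be close to 0
```

The cutoff is adequate only for moderate values of the mean; for large means a larger `n_max` (and a more stable evaluation of the weights) would be needed.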

2.2 The Charlier-Parseval identity 

Assume that we have a generating function

\[ F(z) = \sum_{n\ge0} A_n z^n, \]

which can be written in the form

\[ F(z) = e^{\lambda(z-1)} f(z). \tag{2.7} \]

Let

\[ f(z) = \sum_{j\ge0} a_j (z-1)^j. \]

Then, by (2.4), we have formally the Charlier-Jordan expansion

\[ A_n = e^{-\lambda}\frac{\lambda^n}{n!} \sum_{j\ge0} a_j C_j(\lambda,n), \tag{2.8} \]

and we expect that $A_n$ will be close to $e^{-\lambda}\lambda^n/n!$ if $f(z)$ is close to 1, or, alternatively, if $a_0$ is close to 1 and
all other $a_j$'s are close to 0. The following identity provides our first step in quantifying such a heuristic.

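Proposition 2.2 (Charlier-Parseval identity). Assume that $f(z)$ is analytic in the whole complex plane and
satisfies

\[ |f(z)| = O\bigl(e^{H|z-1|^2}\bigr) \tag{2.9} \]

as $|z| \to \infty$. Then for any $\lambda > 2H$

\[ \sum_{n\ge0} \left| \frac{A_n}{e^{-\lambda}\lambda^n/n!} \right|^2 e^{-\lambda}\frac{\lambda^n}{n!} = \int_0^\infty I\bigl(\sqrt{r/\lambda}\bigr) e^{-r}\,dr, \tag{2.10} \]

where

\[ I(r) := \frac{1}{2\pi}\int_{-\pi}^{\pi} \bigl| f(1 + re^{it}) \bigr|^2 dt. \tag{2.11} \]

Proof. Since by definition $I(r) = \sum_{j\ge0} |a_j|^2 r^{2j}$ and the condition (2.9) implies the convergence of the
series $\sum_{j\ge0} |a_j|^2 j!/\lambda^j$, it follows that

\[ \sum_{j\ge0} |a_j|^2 \frac{j!}{\lambda^j} = \int_0^\infty I\bigl(\sqrt{r/\lambda}\bigr) e^{-r}\,dr. \tag{2.12} \]

Both the series and the integral are convergent because, by (2.9), $I(r) = O(e^{2Hr^2})$.
Again by definition

\[ \sum_{n\ge0} A_n z^n = e^{\lambda(z-1)} \sum_{j\ge0} a_j (z-1)^j. \]

Taking the coefficient of $z^n$ on both sides, we obtain (2.8), which can be written as

\[ \frac{A_n}{e^{-\lambda}\lambda^n/n!} = \sum_{j\ge0} a_j C_j(\lambda,n), \]

where the convergence of the above series is pointwise. But the convergence of the series in (2.12) implies
that the series on the right side also converges in $L^2$-norm with respect to the Poisson measure $e^{-\lambda}\lambda^n/n!$.
Thus the Proposition follows from (2.5). □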
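As a sanity check of the identity, one can compare the weighted sum of squares of the $A_n$ with the series $\sum_j |a_j|^2 j!/\lambda^j$ from (2.12) in a concrete case. The sketch below is an illustration only and assumes the Poisson-binomial setting of Section 3, where $F(z) = \prod_j (1 + p_j(z-1))$ and hence $f(1+w) = e^{-\lambda w}\prod_j(1+p_j w)$; the probabilities and the truncation level `j_max` are hypothetical.

```python
import math

def charlier_parseval_sides(ps, j_max=40):
    """Return (lhs, rhs, a): both sides of the identity and the coefficients a_j."""
    lam = sum(ps)
    # e[i] = i-th elementary symmetric polynomial: prod_j (1 + p_j w) = sum_i e[i] w^i
    e = [1.0]
    for p in ps:
        e = [(e[i] if i < len(e) else 0.0) + (e[i - 1] * p if i >= 1 else 0.0)
             for i in range(len(e) + 1)]
    # a_j = [w^j] prod_j (1 + p_j w) * exp(-lam * w), with w = z - 1
    a = [sum(e[i] * (-lam) ** (j - i) / math.factorial(j - i)
             for i in range(min(j, len(e) - 1) + 1))
         for j in range(j_max + 1)]
    rhs = sum(math.factorial(j) / lam ** j * a[j] ** 2 for j in range(j_max + 1))
    # A_m = P(S_n = m) = [z^m] prod_j (q_j + p_j z)
    pmf = [1.0]
    for p in ps:
        pmf = [(pmf[m] if m < len(pmf) else 0.0) * (1 - p)
               + (pmf[m - 1] * p if m >= 1 else 0.0)
               for m in range(len(pmf) + 1)]
    lhs = sum(A * A / (math.exp(-lam) * lam ** m / math.factorial(m))
              for m, A in enumerate(pmf))
    return lhs, rhs, a

ps = [0.1, 0.2, 0.05, 0.3, 0.15]       # hypothetical success probabilities
lhs, rhs, a = charlier_parseval_sides(ps)
print(lhs, rhs)                         # the two sides should agree closely
print(a[0], a[1], a[2])                 # expected pattern: 1, 0, -lambda_2 / 2
```

The series on the right is truncated, which is harmless here because the coefficients $a_j$ decay factorially for a finite product.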



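In the special cases when $F(z) = \sum_{n\ge0} A_n z^n = e^{\lambda(z-1)}(z-1)^k$, or $A_n = C_k(\lambda, n)e^{-\lambda}\lambda^n/n!$, we have the identity

\[ \sum_{n\ge0} e^{-\lambda}\frac{\lambda^n}{n!}\, |C_k(\lambda,n)|^2 = k!\,\lambda^{-k} \qquad (k = 0, 1, \ldots), \]

which is nothing but (2.5) with $k = \ell$. This implies, by the Cauchy-Schwarz inequality, that

\[ \sum_{n\ge0} e^{-\lambda}\frac{\lambda^n}{n!}\, |C_k(\lambda,n)| \le \sqrt{k!}\,\lambda^{-k/2} \qquad (k = 0, 1, \ldots). \tag{2.13} \]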

2.3 A probabilistic interpretation of the Charlier-Parseval identity 

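Assume that $F(z)$ is the probability generating function of some non-negative integer-valued random variable
$X$ having the form

\[ F(z) := \sum_{m\ge0} \mathbb{P}(X = m) z^m = e^{\lambda(z-1)} \sum_{j\ge0} a_j (z-1)^j. \]

Applying the Charlier-Parseval identity (2.10) and (2.12) to $F$ gives

\[ \sum_{m\ge0} \left| \frac{\mathbb{P}(X = m)}{e^{-\lambda}\lambda^m/m!} \right|^2 e^{-\lambda}\frac{\lambda^m}{m!} = \sum_{j\ge0} \frac{j!}{\lambda^j}\,|a_j|^2, \]

provided that both series converge. In view of the orthogonality relations (2.5), the coefficients $a_j$ can be
expressed as

\[ a_j = \frac{\lambda^j}{j!} \sum_{m\ge0} C_j(\lambda,m)\,\mathbb{P}(X = m) = \frac{\lambda^j}{j!}\,\mathbb{E}\,C_j(\lambda,X). \]

Thus, since $a_0 = 1$,

\[ \sum_{m\ge0} \left| \frac{\mathbb{P}(X = m)}{e^{-\lambda}\lambda^m/m!} - 1 \right|^2 e^{-\lambda}\frac{\lambda^m}{m!} = \sum_{j\ge1} \frac{\lambda^j}{j!}\,\bigl|\mathbb{E}\,C_j(\lambda,X)\bigr|^2. \]

This identity relates the closeness of $X$ to the Poisson measure by means of the moments of $X$ since the
quantity $\mathbb{E}\,C_j(\lambda, X)$ is a linear combination of the moments of $X$.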

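On the other hand, it is also clear, by the Cauchy-Schwarz inequality, that the series on the right-hand side
satisfies

\[ \Bigl( \sum_{j\ge1} \frac{\lambda^j}{j!}\,\bigl|\mathbb{E}\,C_j(\lambda,X)\bigr|^2 \Bigr)^{1/2} = \sup\, \mathbb{E} \sum_{j\ge1} \alpha_j C_j(\lambda,X), \]

where the supremum is taken over all real sequences $\{\alpha_j\}_{j\ge1}$ such that $\sum_{j\ge1} \alpha_j^2\, j!/\lambda^j = 1$. Let

\[ g(x) := \sum_{j\ge1} \alpha_j C_j(\lambda,x). \]

Then

\[ \sup\, \mathbb{E} \sum_{j\ge1} \alpha_j C_j(\lambda,X) = \sup_{\mathbb{E}g(\zeta)=0,\ \mathbb{E}g(\zeta)^2=1} \mathbb{E}\,g(X), \]

where $\zeta$ is a Poisson random variable with mean $\lambda$.

Applying the difference equation (2.2) for Charlier polynomials and taking into account that
$\mathbb{E}\,g(\zeta) = 0$, we then have

\[ g(X) = \frac{1}{\lambda} \sum_{j\ge1} \alpha_j \bigl( X C_{j-1}(\lambda,X-1) - \lambda C_{j-1}(\lambda,X) \bigr) = \frac{1}{\lambda}\bigl( X h(X-1) - \lambda h(X) \bigr), \]

where $h(x) = \sum_{j\ge1} \alpha_j C_{j-1}(\lambda, x)$. Thus, absorbing the factor $1/\lambda$ into $h$, we can write

\[ \Bigl( \sum_{m\ge0} \left| \frac{\mathbb{P}(X = m)}{e^{-\lambda}\lambda^m/m!} - 1 \right|^2 e^{-\lambda}\frac{\lambda^m}{m!} \Bigr)^{1/2} = \sup\, \mathbb{E}\bigl( X h(X-1) - \lambda h(X) \bigr), \]

the supremum being taken over all functions $h$ such that $\mathbb{E}\bigl(\zeta h(\zeta-1) - \lambda h(\zeta)\bigr)^2 = 1$. The right-hand
side of the last expression is reminiscent of the Chen-Stein equation; see the book [9]; see also Goldstein
and Reinert [36] and the references therein for the connection between orthogonal polynomials and Stein's
method.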



2.4 Asymptotic forms of the Charlier-Parseval identity 

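The identity (2.10) can be readily extended to the following effective (or asymptotic) versions for large $\lambda$.

Proposition 2.3 (Asymptotic forms of the Charlier-Parseval identity). Let $F(z)$ and $f(z)$ be defined as
above. Assume that $f$ is an entire function and satisfies the condition

\[ |f(z)| \le K e^{H|z-1|^2} \tag{2.14} \]

for all $z \in \mathbb{C}$, with some positive constants $K$ and $H$. Then uniformly for all $N \ge 0$ and $\lambda \ge (2+\varepsilon)H$
with $\varepsilon > 0$

\[ \sum_{n\ge0} \left| \frac{A_n}{e^{-\lambda}\lambda^n/n!} - \sum_{0\le j\le N} a_j C_j(\lambda,n) \right|^2 e^{-\lambda}\frac{\lambda^n}{n!} \le K^2\,\frac{2+\varepsilon}{\varepsilon} \left( \frac{(2+\varepsilon)H}{\lambda} \right)^{N+1}, \tag{2.15} \]

\[ \sum_{n\ge0} \left| A_n - e^{-\lambda}\frac{\lambda^n}{n!} \sum_{0\le j\le N} a_j C_j(\lambda,n) \right| \le K \sqrt{\frac{2+\varepsilon}{\varepsilon}} \left( \frac{(2+\varepsilon)H}{\lambda} \right)^{(N+1)/2}, \tag{2.16} \]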

and uniformly for alln ^ 

A" 



An — e 



-A- 



n! 



J2 



n] 



€ K- 



A 



(2.17) 



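Proof. Applying (2.10) with $\lambda = (2+\varepsilon)H$ and using the upper bound $I(r) \le K^2 e^{2Hr^2}$ (by (2.14)), we get

\[ \sum_{j\ge0} |a_j|^2 \frac{j!}{((2+\varepsilon)H)^j} = \int_0^\infty I\Bigl( \sqrt{\frac{r}{(2+\varepsilon)H}} \Bigr) e^{-r}\,dr \le K^2 \int_0^\infty e^{-\varepsilon r/(2+\varepsilon)}\,dr = K^2\,\frac{2+\varepsilon}{\varepsilon}. \]

Applying again Proposition 2.2 but to the function $f(z) - \sum_{0\le j\le N} a_j (z-1)^j$ and using the above
estimate for $\lambda \ge (2+\varepsilon)H$, we get

\[ \sum_{n\ge0} \left| \frac{A_n}{e^{-\lambda}\lambda^n/n!} - \sum_{0\le j\le N} a_j C_j(\lambda,n) \right|^2 e^{-\lambda}\frac{\lambda^n}{n!} = \sum_{j>N} |a_j|^2 \frac{j!}{\lambda^j} \]
\[ \le \left( \frac{(2+\varepsilon)H}{\lambda} \right)^{N+1} \sum_{j>N} |a_j|^2 \frac{j!}{((2+\varepsilon)H)^j} \le K^2\,\frac{2+\varepsilon}{\varepsilon} \left( \frac{(2+\varepsilon)H}{\lambda} \right)^{N+1}. \]

Thus (2.15) follows and the estimate (2.16) is an immediate consequence of the Cauchy-Schwarz inequality.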
For (2.17), we apply Proposition 2.2 to the function 



:i - ( / w - E 



aj{z — ly 



and obtain 



E 

n>0 



n] 



|a,f (j + 1)! 



By partial summation, (2.2) and Cauchy-Schwarz inequality 



An - e 



A" 



n] 



ml 



A — A 

^m— 1 

«-aA!1 



1/2 



1/2 



\j>N 



(2.18) 



Now for A ^ (2 + 6)H 



E^4r^(v^)-- 



ri>0 







T^2 roo 

A Jo 



A(l -2/7/A)2' 
Thus (2.17) follows from substituting this bound into (2.18). 



□ 



2.5 Some useful estimates of Tauberian type 

We now derive a few other effective bounds for certain partial sums or series by applying the
Charlier-Parseval bounds we derived above; these bounds are more suitable for use for the diverse Poisson
approximation distances we will consider. They are the types of results that have more or less the flavor of typical
Tauberian theorems.

Assume that $\zeta_\lambda$ has a Poisson($\lambda$) distribution. Denote by

\[ Z(n) = \min\{ \mathbb{P}(\zeta_\lambda \le n),\ \mathbb{P}(\zeta_\lambda > n) \}. \]

It is clear that $Z(n) \le 1/2$.

Proposition 2.4. Let $F$, $f$, $A_n$, $a_n$ and $I$ be defined as in (2.7) and (2.11). Assume that $f(z)$ is an entire
function and satisfies the condition (2.9). Then for $\lambda > 2H$ the following inequalities hold. For $n \ge 0$,

\[ \sum_{n\ge0} |A_n| \le \Bigl( \int_0^\infty I\bigl(\sqrt{r/\lambda}\bigr) e^{-r}\,dr \Bigr)^{1/2}, \tag{2.19} \]

\[ |A_n| \le \Bigl( \frac{1}{\lambda} \int_0^\infty I\bigl(\sqrt{r/\lambda}\bigr)\, r e^{-r}\,dr \Bigr)^{1/2} \sqrt{Z(n)}. \tag{2.20} \]

If we additionally assume that $F(1) = 0$, then for $n \ge 0$,

\[ \sum_{n\ge0} |A_0 + A_1 + \cdots + A_n| \le \sqrt{\lambda}\, \Bigl( \int_0^\infty I\bigl(\sqrt{r/\lambda}\bigr)\, r^{-1} e^{-r}\,dr \Bigr)^{1/2}, \tag{2.21} \]

\[ |A_0 + A_1 + \cdots + A_n| \le \Bigl( \int_0^\infty I\bigl(\sqrt{r/\lambda}\bigr) e^{-r}\,dr \Bigr)^{1/2} \sqrt{Z(n)}. \tag{2.22} \]


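Proof. By the Cauchy-Schwarz inequality

\[ \sum_{n\ge0} |A_n| \le \Bigl( \sum_{n\ge0} \frac{|A_n|^2}{e^{-\lambda}\lambda^n/n!} \Bigr)^{1/2} \Bigl( \sum_{n\ge0} e^{-\lambda}\frac{\lambda^n}{n!} \Bigr)^{1/2} = \Bigl( \sum_{n\ge0} \left| \frac{A_n}{e^{-\lambda}\lambda^n/n!} \right|^2 e^{-\lambda}\frac{\lambda^n}{n!} \Bigr)^{1/2}. \]

The upper bound (2.19) then follows from (2.10).

The third inequality (2.21) is proved by applying (2.19) to the function $F_1(z) := F(z)/(1-z)$. Note
that the condition $F(1) = 0$ implies that $F_1(z)$ is regular at $z = 1$. With this $F_1$, (2.19) now has the form

\[ \sum_{n\ge0} |A_0 + A_1 + \cdots + A_n| \le \Bigl( \int_0^\infty I_1\bigl(\sqrt{r/\lambda}\bigr) e^{-r}\,dr \Bigr)^{1/2}, \]

where

\[ I_1(r) := \frac{1}{2\pi r^2} \int_{-\pi}^{\pi} \bigl| f(1 + re^{it}) \bigr|^2 dt = \frac{I(r)}{r^2}, \]

and (2.21) follows.

For the fourth inequality (2.22), we start from applying the Cauchy-Schwarz inequality, giving

\[ |A_0 + A_1 + \cdots + A_n| \le \Bigl( \sum_{j\ge0} \left| \frac{A_j}{e^{-\lambda}\lambda^j/j!} \right|^2 e^{-\lambda}\frac{\lambda^j}{j!} \Bigr)^{1/2} \Bigl( \sum_{0\le j\le n} e^{-\lambda}\frac{\lambda^j}{j!} \Bigr)^{1/2}. \tag{2.23} \]

On the other hand, the condition $F(1) = 0$ implies that $\sum_{j\ge0} A_j = 0$. Consequently,

\[ |A_0 + A_1 + \cdots + A_n| = |A_{n+1} + A_{n+2} + \cdots| \le \Bigl( \sum_{j\ge0} \left| \frac{A_j}{e^{-\lambda}\lambda^j/j!} \right|^2 e^{-\lambda}\frac{\lambda^j}{j!} \Bigr)^{1/2} \Bigl( \sum_{j>n} e^{-\lambda}\frac{\lambda^j}{j!} \Bigr)^{1/2}. \tag{2.24} \]

Taking the minimum of the two upper bounds (2.23) and (2.24), we obtain (2.22).

Finally, the second inequality (2.20) follows from (2.22) by applying it to the generating function
$(1-z)F(z)$ instead of $F(z)$. □

3 Applications. I. Distances for Poisson approximation

We apply in this section the diverse tools based on the Charlier-Parseval identity and derive bounds for the
closeness between the Poisson-binomial distribution and a Poisson distribution with the same mean. We
need a few simple inequalities.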

3.1 Lemmas 

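Lemma 3.1. The inequalities

\[ \bigl| (1+z)e^{-z} \bigr| \le e^{|z|^2/2}, \tag{3.1} \]

\[ \Bigl| (1+z)e^{-z} + \sum_{0\le j\le m} \frac{j-1}{j!}(-z)^j \Bigr| \le c_m |z|^{m+1} e^{|z|^2/2} \tag{3.2} \]

hold for all $z \in \mathbb{C}$, where $m \ge 1$ and

\[ c_m := \frac{1}{m!} \int_0^1 e^{t^2/2} (1-t)^{m-1} (m-1+t)\,dt. \tag{3.3} \]

Proof. Write $z = re^{it}$, where $r \ge 0$ and $t \in \mathbb{R}$. Then, by $1 + x \le e^x$ for $x \in \mathbb{R}$,

\[ \bigl| (1+z)e^{-z} \bigr| = \sqrt{1 + 2r\cos t + r^2}\; e^{-r\cos t} \le e^{r\cos t + r^2/2}\, e^{-r\cos t} = e^{r^2/2}. \]

For (3.2), we start with the relation

\[ e^z - \sum_{0\le j<m} \frac{z^j}{j!} = \frac{z^m}{(m-1)!} \int_0^1 e^{tz}(1-t)^{m-1}\,dt, \]

and deduce that

\[ (1-z)e^z + \sum_{0\le j\le m} \frac{j-1}{j!} z^j = -\frac{z^{m+1}}{m!} \int_0^1 e^{tz}(1-t)^{m-1}(m-1+t)\,dt, \]

for $m \ge 1$. Thus (3.2) follows from the inequality $|tz| \le |z|^2/2 + t^2/2$. □

Remark 3.2. Note that in the proof of (3.2), we have the inequality

\[ \frac{\bigl| 1 + (x-1)e^x \bigr|}{x^2 e^{x^2/2}} \le c_1 = \sqrt{e} - 1 = 0.64872\ldots \qquad (x \in \mathbb{R}), \]

which can easily be sharpened, by elementary calculus, to

\[ \frac{\bigl| 1 + (x-1)e^x \bigr|}{x^2 e^{x^2/2}} \le 0.63236\ldots. \]

But this improvement over $c_1$ is marginal, so we retain the simpler upper bound $c_1$ in the following use.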
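The constants $c_1$ and $c_2$ of (3.3) and the two inequalities of Lemma 3.1 lend themselves to a quick numerical check. The following sketch is illustrative only: it evaluates the integral in (3.3), as reconstructed above, by Simpson's rule and tests (3.1) and (3.2) with $m = 1$ on a hypothetical grid of complex points; the names `c` and `check_lemma` are not from the paper.

```python
import cmath
import math

def c(m, steps=2000):
    """c_m = (1/m!) * integral over [0,1] of e^{t^2/2} (1-t)^{m-1} (m-1+t) dt (Simpson's rule)."""
    g = lambda t: math.exp(t * t / 2) * (1 - t) ** (m - 1) * (m - 1 + t)
    h = 1.0 / steps                       # steps must be even for Simpson's rule
    s = g(0.0) + g(1.0)
    for i in range(1, steps):
        s += (4 if i % 2 else 2) * g(i * h)
    return s * h / 3 / math.factorial(m)

c1, c2 = c(1), c(2)

def check_lemma(z):
    """True if (3.1) and (3.2) with m = 1 hold at the complex point z (small relative slack)."""
    w = (1 + z) * cmath.exp(-z)
    cap = math.exp(abs(z) ** 2 / 2)
    ok_31 = abs(w) <= cap * (1 + 1e-12)
    ok_32 = abs(w - 1) <= c1 * abs(z) ** 2 * cap * (1 + 1e-12)
    return ok_31 and ok_32

grid = [complex(x / 4, y / 4) for x in range(-12, 13) for y in range(-12, 13)
        if (x, y) != (0, 0)]
print(c1, c2, all(check_lemma(z) for z in grid))
```

A finite grid of course proves nothing; it only guards against transcription errors in the constants and exponents.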
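The next lemma is crucial in applying our Charlier-Parseval bounds derived above.

Lemma 3.3. The inequality

\[ \Bigl| \prod_{1\le k\le n} (1+v_k)e^{-v_k} - 1 \Bigr| \le c_1 V_2\, e^{V_2/2} \tag{3.4} \]

holds for any complex numbers $\{v_k\}$, where

\[ V_m := \sum_{1\le k\le n} |v_k|^m. \tag{3.5} \]

Proof. By partial summation

\[ \prod_{1\le k\le n} \xi_k - \prod_{1\le k\le n} \eta_k = \sum_{1\le k\le n} (\xi_k - \eta_k) \Bigl( \prod_{1\le j<k} \xi_j \Bigr) \Bigl( \prod_{k<j\le n} \eta_j \Bigr) \]

for nonzero $\{\xi_k\}$ and $\{\eta_k\}$. Applying this formula, we get

\[ \prod_{1\le k\le n} (1+v_k)e^{-v_k} - 1 = \sum_{1\le k\le n} \bigl( (1+v_k)e^{-v_k} - 1 \bigr) \prod_{1\le j<k} (1+v_j)e^{-v_j}. \tag{3.6} \]

By the two inequalities (3.1) and (3.2) with $m = 1$, we then obtain

\[ \Bigl| \prod_{1\le k\le n} (1+v_k)e^{-v_k} - 1 \Bigr| \le c_1 \sum_{1\le k\le n} |v_k|^2 \exp\Bigl( \frac12 \sum_{1\le j\le k} |v_j|^2 \Bigr) \le c_1 V_2\, e^{V_2/2}, \]

and (3.4) follows. □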



3.2 New results 

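We are ready to apply in this section the tools we developed above to derive bounds for several Poisson
approximation distances.
Let

\[ S_n := X_1 + X_2 + \cdots + X_n, \]

where the $X_j$'s are independent Bernoulli random variables with

\[ \mathbb{P}(X_j = 1) = 1 - \mathbb{P}(X_j = 0) = p_j \qquad (1 \le j \le n). \]

Then, here and throughout this section,

\[ F(z) := \sum_{m\ge0} \mathbb{P}(S_n = m) z^m = \prod_{1\le j\le n} (q_j + p_j z), \tag{3.7} \]

where $q_j := 1 - p_j$. Define $\lambda_m := \sum_{1\le j\le n} p_j^m$, $\lambda := \lambda_1$ and $\theta := \lambda_2/\lambda_1$.
Let $\mathscr{P}(\lambda)$ denote a Poisson distribution with mean $\lambda$.

Theorem 3.4. We have the following estimates: (i) for the $\chi^2$-distance

\[ d_{\chi^2}(\mathscr{L}(S_n), \mathscr{P}(\lambda)) := \sum_{m\ge0} \left| \frac{\mathbb{P}(S_n = m)}{e^{-\lambda}\lambda^m/m!} - 1 \right|^2 e^{-\lambda}\frac{\lambda^m}{m!} \le \frac{2c_1^2\theta^2}{(1-\theta)^3}; \]

(ii) for the total variation distance

\[ d_{TV}(\mathscr{L}(S_n), \mathscr{P}(\lambda)) := \frac12 \sum_{m\ge0} \left| \mathbb{P}(S_n = m) - e^{-\lambda}\frac{\lambda^m}{m!} \right| \le \frac{c_1\theta}{\sqrt{2}\,(1-\theta)^{3/2}}; \]

and (iii) for the Wasserstein (or Fortet-Mourier) distance

\[ d_W(\mathscr{L}(S_n), \mathscr{P}(\lambda)) := \sum_{m\ge0} \Bigl| \mathbb{P}(S_n \le m) - e^{-\lambda} \sum_{j\le m} \frac{\lambda^j}{j!} \Bigr| \le \frac{c_1\lambda_2}{\sqrt{\lambda}\,(1-\theta)}. \]

We also have the following non-uniform bounds for $m \ge 0$: (iv) for the Kolmogorov distance

\[ \Bigl| \mathbb{P}(S_n \le m) - e^{-\lambda} \sum_{j\le m} \frac{\lambda^j}{j!} \Bigr| \le \frac{\sqrt{2}\,c_1\theta}{(1-\theta)^{3/2}}\, \sqrt{Z(m)}; \]

and (v) for the point metric

\[ \left| \mathbb{P}(S_n = m) - e^{-\lambda}\frac{\lambda^m}{m!} \right| \le \frac{\sqrt{6}\,c_1\theta}{\sqrt{\lambda}\,(1-\theta)^2}\, \sqrt{Z(m)}. \]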

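Proof. For (i), we apply (2.10) to the function $F(z) - e^{\lambda(z-1)}$ and use the inequality (3.4) with $v_j = p_j re^{it}$
to estimate the integral $I$. This yields

\[ I(r) = \frac{1}{2\pi} \int_{-\pi}^{\pi} \Bigl| \prod_{1\le j\le n} (1 + p_j re^{it}) e^{-p_j re^{it}} - 1 \Bigr|^2 dt \le c_1^2 \lambda_2^2 r^4 e^{\lambda_2 r^2}; \tag{3.8} \]

hence

\[ \int_0^\infty I\bigl(\sqrt{r/\lambda}\bigr) e^{-r}\,dr \le c_1^2\theta^2 \int_0^\infty r^2 e^{-r(1-\theta)}\,dr = \frac{2c_1^2\theta^2}{(1-\theta)^3}, \]

and the estimate in (i) for the $\chi^2$-distance follows.

Similarly, the inequalities in (ii) and in (iv) follow from substituting the estimate (3.8) into the two
inequalities (2.19) and (2.22) respectively.

As to the non-uniform estimate in (v) for the point metric, we have, again, by (3.8),

\[ \int_0^\infty I\bigl(\sqrt{r/\lambda}\bigr)\, r e^{-r}\,dr \le c_1^2\theta^2 \int_0^\infty r^3 e^{-r(1-\theta)}\,dr = \frac{6c_1^2\theta^2}{(1-\theta)^4}. \]

Substituting this estimate in (2.20) gives the inequality in (v).

Finally, the upper bound in (iii) for $d_W$ is derived similarly by the inequality (2.21) using again (3.8):

\[ \int_0^\infty I\bigl(\sqrt{r/\lambda}\bigr)\, r^{-1} e^{-r}\,dr \le c_1^2\theta^2 \int_0^\infty r e^{-r(1-\theta)}\,dr = \frac{c_1^2\theta^2}{(1-\theta)^2}. \]

This completes the proof of the theorem. □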
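The first three bounds of Theorem 3.4 can be compared with exact values on a small example. The sketch below is an illustration under stated assumptions, not the authors' computation: the bounds are taken in the form $2c_1^2\theta^2/(1-\theta)^3$, $c_1\theta/(\sqrt2(1-\theta)^{3/2})$ and $c_1\lambda_2/(\sqrt\lambda(1-\theta))$ with $c_1 = \sqrt e - 1$, the Poisson tail is truncated after a hypothetical `extra` number of terms, and the success probabilities are arbitrary.

```python
import math

def pmf_pair(ps, extra=60):
    """pmf of S_n (padded with zeros) and of Poisson(lambda) on 0..n+extra."""
    pb = [1.0]
    for p in ps:
        pb = [(pb[m] if m < len(pb) else 0.0) * (1 - p)
              + (pb[m - 1] * p if m >= 1 else 0.0)
              for m in range(len(pb) + 1)]
    lam = sum(ps)
    size = len(ps) + extra + 1
    pb += [0.0] * (size - len(pb))
    po = [math.exp(-lam)]
    for m in range(1, size):
        po.append(po[-1] * lam / m)
    return pb, po

def distances(ps):
    """(chi-square, total variation, Wasserstein) distances to Poisson(lambda)."""
    pb, po = pmf_pair(ps)
    chi2 = sum((a - b) ** 2 / b for a, b in zip(pb, po))
    tv = 0.5 * sum(abs(a - b) for a, b in zip(pb, po))
    cum, wass = 0.0, 0.0
    for a, b in zip(pb, po):
        cum += a - b                  # difference of the two distribution functions at m
        wass += abs(cum)
    return chi2, tv, wass

def theorem_bounds(ps):
    """Right-hand sides of (i)-(iii), in the form assumed in the lead-in."""
    lam = sum(ps)
    lam2 = sum(p * p for p in ps)
    theta = lam2 / lam
    c1 = math.sqrt(math.e) - 1
    return (2 * c1 ** 2 * theta ** 2 / (1 - theta) ** 3,
            c1 * theta / (math.sqrt(2) * (1 - theta) ** 1.5),
            c1 * lam2 / (math.sqrt(lam) * (1 - theta)))

ps = [0.1, 0.2, 0.05, 0.3, 0.15]     # hypothetical success probabilities
actual, bounds = distances(ps), theorem_bounds(ps)
for name, a, b in zip(("chi2", "TV", "Wasserstein"), actual, bounds):
    print(f"{name}: exact = {a:.5f}, bound = {b:.5f}")
```

Such a comparison only illustrates the size of the slack for one parameter choice; the bounds are most informative when $\theta$ is small.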
The reason for studying the $\chi^2$-distance (also referred to as the quadratic divergence) is at least twofold
in addition to its applications in real problems. First, it is structurally simpler than most other distances
because it satisfies the following identity.

Corollary 3.5. Let $\{a_j\}$ be given by

\[ F(z) = e^{\lambda(z-1)} \Bigl( 1 + \sum_{j\ge2} a_j (z-1)^j \Bigr), \tag{3.9} \]

where $F$ is given in (3.7). Then the $\chi^2$-distance satisfies the identity

\[ d_{\chi^2}(\mathscr{L}(S_n), \mathscr{P}(\lambda)) = \sum_{j\ge2} \frac{j!}{\lambda^j}\,|a_j|^2. \tag{3.10} \]

Proof. By (3.9), we have

\[ \mathbb{P}(S_n = m) - e^{-\lambda}\frac{\lambda^m}{m!} = e^{-\lambda}\frac{\lambda^m}{m!} \sum_{j\ge2} a_j C_j(\lambda,m). \tag{3.11} \]

Then (3.10) follows from (2.12). □



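Second, the $\chi^2$-distance is often used to provide bounds for other distances; see [14]. An example is as
follows.

Corollary 3.6. The information divergence (or the Kullback-Leibler divergence) satisfies

\[ d_{KL}(\mathscr{L}(S_n), \mathscr{P}(\lambda)) := \sum_{m\ge0} \mathbb{P}(S_n = m) \log \frac{\mathbb{P}(S_n = m)}{e^{-\lambda}\lambda^m/m!} \le \frac{2c_1^2\theta^2}{(1-\theta)^3}. \tag{3.12} \]

Proof. Let $x_j$ and $y_j$ be two sequences of non-negative real numbers such that

\[ x_0 + x_1 + \cdots = 1 \quad\text{and}\quad y_0 + y_1 + \cdots = 1. \]

By the elementary inequality $\log x \le x - 1$, we obtain

\[ \sum_{n\ge0} y_n \log\frac{y_n}{x_n} \le \sum_{n\ge0} y_n \Bigl( \frac{y_n}{x_n} - 1 \Bigr) = \sum_{n\ge0} \Bigl| \frac{y_n}{x_n} - 1 \Bigr|^2 x_n. \]

Thus $d_{KL} \le d_{\chi^2}$. Now (3.12) follows from applying this inequality with $x_m = e^{-\lambda}\lambda^m/m!$ and $y_m =
\mathbb{P}(S_n = m)$ and then using the inequality in (i) of Theorem 3.4. □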



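Since $Z(m) \le 1/2$, from the two non-uniform estimates (iv) and (v) of Theorem 3.4, we easily obtain
that the Kolmogorov distance satisfies

\[ d_K(\mathscr{L}(S_n), \mathscr{P}(\lambda)) := \sup_m \Bigl| \mathbb{P}(S_n \le m) - e^{-\lambda} \sum_{j\le m} \frac{\lambda^j}{j!} \Bigr| \le \frac{c_1\theta}{(1-\theta)^{3/2}}, \]

and the point metric is bounded above by

\[ d_P(\mathscr{L}(S_n), \mathscr{P}(\lambda)) := \sup_m \left| \mathbb{P}(S_n = m) - e^{-\lambda}\frac{\lambda^m}{m!} \right| \le \frac{\sqrt{3}\,c_1\theta}{\sqrt{\lambda}\,(1-\theta)^2}. \]

Note that the estimate so obtained for the Kolmogorov distance is worse than that obtained by the simple
relation $d_K \le d_{TV}$ and the estimate (ii) of Theorem 3.4.

The quantity $Z(m)$ can be readily bounded above by the following estimate; see also [9, p. 259] or
[44].

Lemma 3.7.

\[ Z(m) \le e^{-(m-\lambda)^2/(2(m+\lambda))}. \]

Proof. Let $r = m/\lambda$. If $m \ge \lambda$, then

\[ Z(m) \le \mathbb{P}(\zeta_\lambda > m) \le r^{-m} e^{\lambda(r-1)} = e^{-\lambda\psi(m/\lambda)}, \]

where $\psi(x) := 1 - x + x\log x$. We now prove that

\[ \psi(x) \ge \frac{(x-1)^2}{2(1+x)} \qquad (x > 0), \tag{3.13} \]

or, equivalently,

\[ \int_0^x \log(1+t)\,dt \ge \frac{x^2}{2(2+x)} \qquad (x > -1). \]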
To prove (3.13), observe first that log(l + t) ^ + t) for t > —1 since log(l + v)dv ^ 0. Then 







ex ^ 

log(l + t)dt^ / -—dt, 
J- + f 



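which is bounded below by $x^2/(2(2+x))$ by considering the two cases $x \ge 0$ and $x \in (-1, 0]$. Thus, by
(3.13),

\[ Z(m) \le e^{-(m-\lambda)^2/(2(m+\lambda))}. \]

Similarly, if $m \le \lambda$, then $r \le 1$, and

\[ Z(m) \le \mathbb{P}(\zeta_\lambda \le m) \le r^{-m} e^{\lambda(r-1)} = e^{-\lambda\psi(m/\lambda)} \le e^{-(m-\lambda)^2/(2(m+\lambda))}. \]

□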
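The tail bound of Lemma 3.7 can be compared with exact Poisson tails. The sketch below is a rough numerical illustration with hypothetical names and parameters; it assumes $Z(m) = \min\{\mathbb{P}(\zeta_\lambda \le m), \mathbb{P}(\zeta_\lambda > m)\}$ and approximates the upper tail by summing a fixed number of terms beyond $m$, which slightly underestimates it.

```python
import math

def poisson_Z(m, lam, extra=200):
    """Z(m) = min(P(zeta <= m), P(zeta > m)) for zeta ~ Poisson(lam); upper tail truncated."""
    w = math.exp(-lam)                 # w = e^{-lam} lam^j / j!
    lower, upper = 0.0, 0.0
    for j in range(m + extra + 1):
        if j <= m:
            lower += w
        else:
            upper += w
        w *= lam / (j + 1)
    return min(lower, upper)

def lemma_bound(m, lam):
    """Right-hand side of Lemma 3.7."""
    return math.exp(-(m - lam) ** 2 / (2 * (m + lam)))

for lam in (0.5, 5.0, 50.0):           # hypothetical means
    worst = max(poisson_Z(m, lam) / lemma_bound(m, lam) for m in range(150))
    print(f"lambda = {lam}: max ratio Z(m)/bound over m < 150 is {worst:.4f}")
```

Summing both tails directly, instead of computing one tail as a complement, avoids the loss of precision that would otherwise dominate far from the mean.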

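4 Applications. II. Second-order estimates

We show in this section that the same approach we developed above can be readily extended for obtaining
higher order estimates. For simplicity, we consider only the second-order estimates, for which we need
only to refine Lemma 3.3. From the formal expansion (3.11), we expect that

\[ \mathbb{P}(S_n = m) - e^{-\lambda}\frac{\lambda^m}{m!} \approx a_2\, e^{-\lambda}\frac{\lambda^m}{m!} C_2(\lambda,m) + \text{smaller order terms}, \]

where $a_2 = -\lambda_2/2$, and the error terms for Poisson approximation would be smaller if we take the term
$a_2 e^{-\lambda}\lambda^m C_2(\lambda, m)/m!$ into account.

Lemma 4.1. For any complex numbers $\{v_k\}$, the following inequality holds

\[ \Bigl| \prod_{1\le k\le n} (1+v_k)e^{-v_k} - 1 + \frac12 \sum_{1\le k\le n} v_k^2 \Bigr| \le \Bigl( \frac{c_1}{4} V_2^2 + c_2 V_3 \Bigr) e^{V_2/2}, \tag{4.1} \]

where $V_m$ is defined in (3.5), $c_1 = \sqrt{e} - 1$ and (see (3.3))

\[ c_2 = \frac12 \int_0^1 e^{t^2/2}(1-t^2)\,dt \approx 0.3706. \]

Proof. By (3.6),

\[ \prod_{1\le k\le n} (1+v_k)e^{-v_k} - 1 + \frac12 \sum_{1\le k\le n} v_k^2 = \sum_{1\le k\le n} \Bigl( (1+v_k)e^{-v_k} - 1 + \frac{v_k^2}{2} \Bigr) \prod_{1\le j<k} (1+v_j)e^{-v_j} - \frac12 \sum_{1\le k\le n} v_k^2 \Bigl( \prod_{1\le j<k} (1+v_j)e^{-v_j} - 1 \Bigr). \]

By (3.1), (3.2) with $m = 2$ and (3.4), we then obtain

\[ \Bigl| \prod_{1\le k\le n} (1+v_k)e^{-v_k} - 1 + \frac12 \sum_{1\le k\le n} v_k^2 \Bigr| \le \Bigl( c_2 \sum_{1\le k\le n} |v_k|^3 + \frac{c_1}{2} \sum_{1\le j<k\le n} |v_j|^2 |v_k|^2 \Bigr) e^{V_2/2}, \]

and (4.1) follows. □

For simplicity, let

\[ P_1(z) := e^{\lambda(z-1)} \Bigl( 1 - \frac{\lambda_2}{2}(z-1)^2 \Bigr). \]

Then

\[ [z^m] P_1(z) = e^{-\lambda}\frac{\lambda^m}{m!} \Bigl( 1 - \frac{\lambda_2}{2} C_2(\lambda,m) \Bigr), \tag{4.2} \]

\[ [z^m] \frac{P_1(z)}{1-z} = \sum_{j\le m} e^{-\lambda}\frac{\lambda^j}{j!} + \frac{\lambda_2}{2} C_1(\lambda,m)\, e^{-\lambda}\frac{\lambda^m}{m!}, \]

where $C_1$, $C_2$ are given in (2.3).

With the inequality (4.1) and Proposition 2.4, we can now refine Theorem 3.4 as follows.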

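Theorem 4.2. For $\theta < 1$, we have the following second-order estimates for $\chi^2$-, total variation and
Wasserstein distances, respectively,

\[ \Bigl( \sum_{m\ge0} \frac{\bigl| \mathbb{P}(S_n = m) - [z^m]P_1(z) \bigr|^2}{e^{-\lambda}\lambda^m/m!} \Bigr)^{1/2} \le \frac{\sqrt{6}\,c_1\theta^2}{2(1-\theta)^{5/2}} + \frac{\sqrt{6}\,c_2\lambda_3}{\lambda^{3/2}(1-\theta)^2}, \]

\[ \frac12 \sum_{m\ge0} \bigl| \mathbb{P}(S_n = m) - [z^m]P_1(z) \bigr| \le \frac{\sqrt{3}\,c_1\theta^2}{2\sqrt{2}\,(1-\theta)^{5/2}} + \frac{\sqrt{3}\,c_2\lambda_3}{\sqrt{2}\,\lambda^{3/2}(1-\theta)^2}, \]

\[ \sum_{m\ge0} \Bigl| \mathbb{P}(S_n \le m) - [z^m]\frac{P_1(z)}{1-z} \Bigr| \le \sqrt{\lambda} \Bigl( \frac{\sqrt{3}\,c_1\theta^2}{2\sqrt{2}\,(1-\theta)^2} + \frac{\sqrt{2}\,c_2\lambda_3}{\lambda^{3/2}(1-\theta)^{3/2}} \Bigr), \]

and second-order non-uniform estimates for Kolmogorov distance and point metric, respectively,

\[ \Bigl| \mathbb{P}(S_n \le m) - [z^m]\frac{P_1(z)}{1-z} \Bigr| \le \sqrt{Z(m)} \Bigl( \frac{\sqrt{3}\,c_1\theta^2}{\sqrt{2}\,(1-\theta)^{5/2}} + \frac{\sqrt{6}\,c_2\lambda_3}{\lambda^{3/2}(1-\theta)^2} \Bigr), \]

\[ \bigl| \mathbb{P}(S_n = m) - [z^m]P_1(z) \bigr| \le \sqrt{\frac{Z(m)}{\lambda}} \Bigl( \frac{\sqrt{15}\,c_1\theta^2}{\sqrt{2}\,(1-\theta)^3} + \frac{2\sqrt{6}\,c_2\lambda_3}{\lambda^{3/2}(1-\theta)^{5/2}} \Bigr). \]

Proof. Let

\[ \tilde F(z) := \prod_{1\le j\le n} \bigl( 1 + p_j(z-1) \bigr) - e^{\lambda(z-1)} \Bigl( 1 - \frac{\lambda_2}{2}(z-1)^2 \Bigr). \]

Take $v_j = p_j(z-1)$ in inequality (4.1). Then

\[ \bigl| e^{-\lambda(z-1)} \tilde F(z) \bigr| \le \Bigl( \frac{c_1}{4}\lambda_2^2 |z-1|^4 + c_2\lambda_3 |z-1|^3 \Bigr) e^{\lambda_2|z-1|^2/2}. \tag{4.3} \]

It follows that the corresponding integral (2.11) satisfies $I(r) \le \bigl( \frac{c_1}{4}\lambda_2^2 r^4 + c_2\lambda_3 r^3 \bigr)^2 e^{\lambda_2 r^2}$.
Substituting this upper bound into the identity (2.10) and using the relation (4.2), we obtain

\[ \Bigl( \sum_{m\ge0} \frac{\bigl( \mathbb{P}(S_n = m) - [z^m]P_1(z) \bigr)^2}{e^{-\lambda}\lambda^m/m!} \Bigr)^{1/2} \le \Bigl( \int_0^\infty \Bigl( \frac{c_1}{4}\theta^2 r^2 + \frac{c_2\lambda_3}{\lambda^{3/2}} r^{3/2} \Bigr)^2 e^{-r(1-\theta)}\,dr \Bigr)^{1/2} \]
\[ \le \frac{c_1\theta^2}{4} \Bigl( \int_0^\infty r^4 e^{-r(1-\theta)}\,dr \Bigr)^{1/2} + \frac{c_2\lambda_3}{\lambda^{3/2}} \Bigl( \int_0^\infty r^3 e^{-r(1-\theta)}\,dr \Bigr)^{1/2} = \frac{\sqrt{24}\,c_1\theta^2}{4(1-\theta)^{5/2}} + \frac{\sqrt{6}\,c_2\lambda_3}{\lambda^{3/2}(1-\theta)^2}, \]

where we used the Minkowski inequality. This proves the second-order estimate for the $\chi^2$-distance.

Similarly, the corresponding estimates for the total variation distance and the (non-uniform estimate of
the) Kolmogorov distance follow from (4.3) and the two inequalities (2.19) and (2.22), respectively.

For the point metric, we have, using again (4.3) and the inequality (2.20),

\[ \bigl| \mathbb{P}(S_n = m) - [z^m]P_1(z) \bigr| \le \sqrt{\frac{Z(m)}{\lambda}} \Bigl( \frac{c_1\theta^2}{4} \Bigl( \int_0^\infty r^5 e^{-r(1-\theta)}\,dr \Bigr)^{1/2} + \frac{c_2\lambda_3}{\lambda^{3/2}} \Bigl( \int_0^\infty r^4 e^{-r(1-\theta)}\,dr \Bigr)^{1/2} \Bigr) \]
\[ = \sqrt{\frac{Z(m)}{\lambda}} \Bigl( \frac{\sqrt{15}\,c_1\theta^2}{\sqrt{2}\,(1-\theta)^3} + \frac{2\sqrt{6}\,c_2\lambda_3}{\lambda^{3/2}(1-\theta)^{5/2}} \Bigr). \]

Finally, the second-order estimate for the Wasserstein distance follows from (4.3) and the inequality (2.21):

\[ \sum_{m\ge0} \Bigl| \mathbb{P}(S_n \le m) - [z^m]\frac{P_1(z)}{1-z} \Bigr| \le \sqrt{\lambda} \Bigl( \frac{c_1\theta^2}{4} \Bigl( \int_0^\infty r^3 e^{-r(1-\theta)}\,dr \Bigr)^{1/2} + \frac{c_2\lambda_3}{\lambda^{3/2}} \Bigl( \int_0^\infty r^2 e^{-r(1-\theta)}\,dr \Bigr)^{1/2} \Bigr). \]

□

Corollary 4.3. The total variation distance between the distribution of $S_n$ and a Poisson distribution of
mean $\lambda$ satisfies, for $\theta < 1$,

\[ d_{TV}(\mathscr{L}(S_n), \mathscr{P}(\lambda)) \le \frac{\theta}{2\sqrt{2}} + \frac{\sqrt{3}\,c_1\theta^2}{2\sqrt{2}\,(1-\theta)^{5/2}} + \frac{\sqrt{3}\,c_2\lambda_3}{\sqrt{2}\,\lambda^{3/2}(1-\theta)^2}. \tag{4.4} \]

Proof. By (2.13) with $k = 2$, we have

\[ \frac12 \sum_{m\ge0} e^{-\lambda}\frac{\lambda^m}{m!}\, |C_2(\lambda,m)| \le \frac{1}{\sqrt{2}\,\lambda}, \]

and (4.4) follows from (4.2) and the second-order estimate for the total variation distance in Theorem 4.2. □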

Remark 4.4. One can easily derive, by the difference equation (2.2) of Charlier polynomials with k = 1, 
that (see for example [43]) 

9 E e-'^\C,{X,m)\ = e-' ^(m+ - A) + ^(A - m_) 



where m± := [A+^iyA+iJ. Asymptotically, for large A, 

- y e-^^|C2(A,m)| = (1 + (A-^)) . 

2 ^ m! ' ^ ' ^' v^A ^ ^ 

m^O ^ 

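By a detailed calculation, Roos [70] showed that

\[ \frac12\sum_{m\ge0} e^{-\lambda}\frac{\lambda^m}{m!}\,|C_2(\lambda,m)| \le \frac{3}{2e\lambda}, \tag{4.5} \]

where numerically

\[ \Bigl\{\frac{1}{\sqrt2},\ \frac{3}{2e},\ \frac{2}{\sqrt{2\pi e}}\Bigr\} \approx \{0.707, 0.552, 0.484\}. \]

Of course, we can apply Roos's inequality (4.5) and replace the constant $1/2^{3/2} \approx 0.354\ldots$ by $3/(4e) \approx
0.276\ldots$ in the first term of our inequality (4.4).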
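As a numerical sanity check of the quantities in Remark 4.4, the following Python sketch compares a truncated version of the sum $\frac12\sum_m e^{-\lambda}\frac{\lambda^m}{m!}|C_2(\lambda,m)|$ with the closed form in terms of $m_\pm$ and with the bound $3/(2e\lambda)$. It assumes the normalization $e^{-\lambda}\frac{\lambda^m}{m!}C_k(\lambda,m) = [z^m](z-1)^k e^{\lambda(z-1)}$, so that the $k=2$ term is the second backward difference of the Poisson probabilities. The function names and the truncation length are illustrative choices, not taken from the paper.

```python
import math


def poisson_pmf(lam, size):
    """Poisson(lam) probabilities for m = 0, ..., size - 1, built iteratively."""
    pmf = [math.exp(-lam)]
    for m in range(1, size):
        pmf.append(pmf[-1] * lam / m)
    return pmf


def half_abs_c2_sum(lam, size=400):
    """Truncated (1/2) * sum_m pi(m) * |C_2(lam, m)|.

    Assumption: pi(m) * C_2(lam, m) = pi(m) - 2 pi(m-1) + pi(m-2),
    with pi(-1) = pi(-2) = 0.
    """
    pmf = poisson_pmf(lam, size)
    total = 0.0
    for m in range(size):
        p1 = pmf[m - 1] if m >= 1 else 0.0
        p2 = pmf[m - 2] if m >= 2 else 0.0
        total += abs(pmf[m] - 2.0 * p1 + p2)
    return 0.5 * total


def closed_form(lam):
    """Closed form with m_pm = floor(lam + 1/2 +- sqrt(lam + 1/4))."""
    root = math.sqrt(lam + 0.25)
    m_plus = math.floor(lam + 0.5 + root)
    m_minus = math.floor(lam + 0.5 - root)
    plus = lam ** (m_plus - 1) / math.factorial(m_plus) * (m_plus - lam)
    minus = lam ** (m_minus - 1) / math.factorial(m_minus) * (lam - m_minus)
    return math.exp(-lam) * (plus + minus)


def roos_bound(lam):
    """Right-hand side 3 / (2 e lam) of inequality (4.5)."""
    return 3.0 / (2.0 * math.e * lam)
```

For $\lambda = 1$ the closed form equals $3/(2e)$, so the bound (4.5) is attained there; for large $\lambda$ the product `lam * closed_form(lam)` approaches $2/\sqrt{2\pi e} \approx 0.484$.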

Corollary 4.5. The $\chi^2$-distance satisfies



32 



d^^iS^), ^(A)) = ^ ( 1 + O ( ^^-^ ) ) . (4.6) 



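Proof. Note that

\[ \sum_{m\ge0}\frac{\bigl(\mathbb{P}(S_n=m)-[z^m]P_1(z)\bigr)^2}{e^{-\lambda}\lambda^m/m!} = \sum_{m\ge0}\frac{\bigl(\mathbb{P}(S_n=m)-e^{-\lambda}\lambda^m/m!\bigr)^2}{e^{-\lambda}\lambda^m/m!} - \frac{\theta^2}{2}. \]

This identity together with the first estimate of Theorem 4.2 and the observation that $\lambda_3 \le \lambda_2^{3/2}$ yields
(4.6). □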
Remark 4.6. An alternative way to prove (4.6) is to use the identity (3.10) and apply the estimate for the
coefficients $a_j$ derived in Shorgin [80]

\a,\ ^ i-jj (j > 2), (4.7) 

and obtain 

by Stirling's formula $j! = O\bigl(j^{1/2}(j/e)^{j}\bigr)$, $j \ge 1$. This and $a_2 = -\lambda_2/2$ give

rf,2(^(^„), ^(A)) = y (1 + O [jY^ey^) ) • (4.8) 
For a further refinement of (4.6), see Corollary 5.3. Note that (4.8) implies that 



2 V v(l-^)^/^ 

5 Applications. III. Approximations by signed measures 

Since the probability generating function of $S_n$ can be represented as

\[ \mathbb{E}z^{S_n} = \exp\Bigl(\sum_{j\ge1}\frac{(-1)^{j-1}}{j}\lambda_j(z-1)^j\Bigr), \]

it has been well known since Herrmann [39] that smaller error terms can be achieved if we use a finite number of
terms in the exponent to approximate $\mathbb{E}z^{S_n}$; namely,

\[ \mathbb{E}z^{S_n} \approx \exp\Bigl(\sum_{1\le j\le k}\frac{(-1)^{j-1}}{j}\lambda_j(z-1)^j\Bigr), \]

for $k \ge 1$. Another advantage of such approximations is that the remainder terms tend to zero not only
when $\theta \to 0$ but also when $\lambda \to \infty$ (while $\theta$ remains, say, less than $1 - \varepsilon$, $\varepsilon > 0$ being a small number).
This gives rise to Poisson approximation via signed measures (sometimes also referred to as compound
Poisson approximations); see Cekanavicius [18], Roos [71], Barbour et al. [5] for more information.

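Although these approximations are not probability generating functions for $k \ge 2$, they can be readily
computed both numerically and asymptotically. Indeed, for $k = 2$,

\[ [z^m]\,e^{\lambda(z-1)-\lambda_2(z-1)^2/2} = e^{-\lambda-\lambda_2/2}\,\frac{\lambda_2^{m/2}}{m!}\,H_m\Bigl(\frac{\lambda+\lambda_2}{\sqrt{\lambda_2}}\Bigr), \]

where the $H_m(x)$'s are the Hermite polynomials (normalized so that $e^{xt-t^2/2} = \sum_{m\ge0} H_m(x)t^m/m!$).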
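The case $k = 2$ can be explored numerically. The sketch below computes the coefficients of $e^{\lambda(z-1)-\lambda_2(z-1)^2/2}$ in two ways: through the three-term recurrence $H_{m+1}(x) = xH_m(x) - mH_{m-1}(x)$ of the Hermite polynomials with generating function $e^{xt-t^2/2}$ (an assumed normalization), and through the double series of $e^{a+bz+cz^2}$. It then compares the $\ell^1$ error of the signed measure with that of the plain Poisson law for a small example. All function names are hypothetical and the code is only an illustration under these assumptions.

```python
import math


def p2_coeffs_hermite(lam, lam2, size):
    """[z^m] exp(lam (z-1) - lam2 (z-1)^2 / 2), m < size, via Hermite polynomials."""
    s = math.sqrt(lam2)
    x = (lam + lam2) / s
    scale = math.exp(-lam - lam2 / 2.0)  # holds e^{-lam-lam2/2} * s^m / m!
    he_prev, he = 0.0, 1.0               # H_{-1} (placeholder) and H_0
    coeffs = []
    for m in range(size):
        coeffs.append(scale * he)
        he_prev, he = he, x * he - m * he_prev  # H_{m+1} = x H_m - m H_{m-1}
        scale *= s / (m + 1)
    return coeffs


def p2_coeffs_series(lam, lam2, size):
    """Same coefficients from exp(a + b z + c z^2) = e^a sum b^i c^j z^(i+2j) / (i! j!)."""
    a, b, c = -lam - lam2 / 2.0, lam + lam2, -lam2 / 2.0
    coeffs = []
    for m in range(size):
        total = 0.0
        for j in range(m // 2 + 1):
            i = m - 2 * j
            total += b ** i / math.factorial(i) * c ** j / math.factorial(j)
        coeffs.append(math.exp(a) * total)
    return coeffs


def poisson_binomial_pmf(ps):
    """Exact law of a sum of independent Bernoulli(p_j) variables, by convolution."""
    pmf = [1.0]
    for p in ps:
        nxt = [0.0] * (len(pmf) + 1)
        for m, mass in enumerate(pmf):
            nxt[m] += mass * (1.0 - p)
            nxt[m + 1] += mass * p
        pmf = nxt
    return pmf


def l1_errors(ps, size=60):
    """Return (sum_m |P(S_n=m) - Poisson|, sum_m |P(S_n=m) - [z^m]P_2|), truncated."""
    lam, lam2 = sum(ps), sum(p * p for p in ps)
    law = (poisson_binomial_pmf(ps) + [0.0] * size)[:size]
    poi = [math.exp(-lam)]
    for m in range(1, size):
        poi.append(poi[-1] * lam / m)
    p2 = p2_coeffs_hermite(lam, lam2, size)
    err_poisson = sum(abs(u - v) for u, v in zip(law, poi))
    err_signed = sum(abs(u - v) for u, v in zip(law, p2))
    return err_poisson, err_signed
```

Some of the coefficients are negative, which is why the approximation is a signed measure; they still sum to $1$ because the generating function equals $1$ at $z = 1$.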
5.1 Approximation by $e^{\lambda(z-1)-\lambda_2(z-1)^2/2}$

We consider the simplest case of such forms when k = 2. 
Lemma 5.1. The inequality 



holds for any complex numbers {vk}, where Vm is given in (3.5) and C2 in (3.3). 
Proof. Again by (3.6), 



/2 



Now 



(l + 2)e-^-e 



' ' 8 

This and the inequality (3.1) yield (5.1). 
Let

\[ P_2(z) := e^{\lambda(z-1)-\lambda_2(z-1)^2/2}. \]
Theorem 5.2. Assume that $\theta < 1$. Then

^ (FjSn = m) - [z"^]P2{z))' XI ( V6C2 



3^ 



m>0 



-A A" 



m\-\z'^\P,{z]\ < 



A3 1(1-^^)2 2v^(l-^)5/2 



A3 / 



C2 



+ 



3^ 



m>0 



A3/2 1(1- 9)2 2^2(1 - ; • 



E 

m>0 



P(^„ ^ m) - 
'„ ^ m) - 



^2(^) 



A3 / 



C2 



A 1(1-0)3/2 4^(1_^^)2 



1 - Z 



^ X^V^M I (13^ + 2v/2(l- 0)5/2 



,P(.„ ^ .) - nP2(.)i ^ ^J^3 
Proof. All estimates follow as in the proof of Theorem 3.4 but with

For the first two estimates of the theorem, we apply the inequality (5.1), which gives 

By the inequality A4 ^ Aai/A^, we obtain 



^ (P(g„ = m) -[z^]P2{z)f \^''' ^ A 



3 f / °° / _ „3/2 . 



1/2 



\m>0 



A3/2 



A3/2 1(1-^)2 8(1-^)5/2 

Then we apply Proposition 2.4. The other estimates are similarly proved. 
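Lemma 5.3. For any $\theta < 1$, we have

\[ \sum_{m\ge0}\frac{\bigl(e^{-\lambda}\lambda^m/m!-[z^m]P_2(z)\bigr)^2}{e^{-\lambda}\lambda^m/m!} = \frac{1}{\sqrt{1-\theta^2}} - 1. \tag{5.2} \]

Proof. Applying (2.10) and (2.12) to the function

\[ F(z) = e^{\lambda(z-1)} - P_2(z) = e^{\lambda(z-1)}\sum_{k\ge1}\frac{(-1)^{k-1}\lambda_2^k}{2^k\,k!}(z-1)^{2k}, \]

we obtain

\[ \sum_{m\ge0}\frac{\bigl(e^{-\lambda}\lambda^m/m!-[z^m]P_2(z)\bigr)^2}{e^{-\lambda}\lambda^m/m!} = \sum_{k\ge1}\Bigl(\frac{\theta}{2}\Bigr)^{2k}\frac{(2k)!}{(k!)^2} = \frac{1}{\sqrt{1-\theta^2}} - 1. \]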
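The identity of Lemma 5.3, read as $\sum_{m\ge0}(e^{-\lambda}\lambda^m/m!-[z^m]P_2(z))^2/(e^{-\lambda}\lambda^m/m!) = 1/\sqrt{1-\theta^2}-1$ with $P_2(z) = e^{\lambda(z-1)-\lambda_2(z-1)^2/2}$, can be checked numerically. The Python sketch below is only an illustration under that reading. It generates the coefficients $q_m = [z^m]P_2(z)$ from the recurrence $(m+1)q_{m+1} = (\lambda+\lambda_2)q_m - \lambda_2 q_{m-1}$, which follows from $P_2'(z) = (\lambda+\lambda_2-\lambda_2 z)P_2(z)$. The function names and the truncation at 80 terms are assumptions made for the example.

```python
import math


def lemma_5_3_lhs(lam, lam2, terms=80):
    """Truncated sum of (pi_m - q_m)^2 / pi_m with q_m = [z^m] P_2(z)."""
    b = lam + lam2
    q_prev, q = 0.0, math.exp(-lam - lam2 / 2.0)  # q_{-1} = 0 and q_0
    pi = math.exp(-lam)                           # Poisson probability pi_0
    total = 0.0
    for m in range(terms):
        if pi == 0.0:  # stop if the Poisson weight underflows
            break
        total += (pi - q) ** 2 / pi
        # (m + 1) q_{m+1} = (lam + lam2) q_m - lam2 q_{m-1}
        q_prev, q = q, (b * q - lam2 * q_prev) / (m + 1)
        pi *= lam / (m + 1)
    return total


def lemma_5_3_rhs(lam, lam2):
    """Closed form 1 / sqrt(1 - theta^2) - 1 with theta = lam2 / lam."""
    theta = lam2 / lam
    return 1.0 / math.sqrt(1.0 - theta * theta) - 1.0
```

For $p = (0.1, 0.2, 0.3)$, that is $\lambda = 0.6$ and $\lambda_2 = 0.14$, both sides are approximately $0.02839$; the terms of the sum decay roughly like $\theta^m$, so a moderate truncation suffices when $\theta$ is not close to $1$.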



Corollary 5.4. For $\theta < 1$,



E 

\Tn>0 



{FjSn = m)- e-'^) 



2\ 1/2 



- 1 



1/2 



A3 / C2VQ V2Ae 



A3/2 1 (1 - ey 8(1 - ^)5/2 



Proof. By applying the Minkowski inequality and the first estimate of Theorem 5.2, we obtain



E 

\m>0 



\ \ X'"\2\ 1/2 

' rn\ / 



-A A" 



(e-^^ - [z-]P2{z)) 



2\ 1/2 



-A A" 



«lE 



iFiSn = m)-[z^]P2iz)) 



2\ 1/2 



A3 / csVe 



A3/2 1(1 -0)2 

Consequently, by (5.2), we obtain (5.3). 
Note that (5.3) implies that, for all $\theta < 1$,

On the other hand, by the inequality $d_{\chi^2} \ge 4d_{TV}^2$ (which follows from (2.12) and (2.19)), we obtain
another upper bound for $d_{TV}$.

Corollary 5.5. For $\theta < 1$,

dTv[Sn,V[\))^-[^==-l] + 



2 Vv^r^ ) A3/2 I 2(1-0)2 16(1-0)5/2 



6 Comparative discussions 

We review briefly some known results in the literature and compare them in this section. For simplicity,
we write $d_*$ for $d_*(\mathcal{L}(S_n), \mathcal{P}(\lambda))$ throughout this section, where $*$ represents one of the distances we
discuss.

Among the five measures of closeness of Poisson approximation $\{d_{\chi^2}, d_{TV}, d_W, d_K, d_P\}$, the estimation
of the three $\{d_{\chi^2}, d_K, d_P\}$ is generally simpler since they can all be easily bounded above
by explicit summation or integral representations: see (3.10) for $d_{\chi^2}$, (6.2) for $d_K$ and (6.3) for $d_P$.
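For small examples the five distances can be evaluated directly, which is convenient when comparing the bounds reviewed in this section. The following Python sketch assumes the usual definitions: $d_{\chi^2}$ is the sum of squared differences weighted by the reciprocal Poisson probabilities, $d_{TV}$ is half the $\ell^1$ distance, $d_W$ is the $\ell^1$ distance between distribution functions, $d_K$ is the supremum distance between distribution functions, and $d_P$ is the largest pointwise difference. The function names, the dictionary keys and the truncation length are illustrative choices.

```python
import math


def poisson_binomial_pmf(ps):
    """Exact law of S_n = sum of independent Bernoulli(p_j), by convolution."""
    pmf = [1.0]
    for p in ps:
        nxt = [0.0] * (len(pmf) + 1)
        for m, mass in enumerate(pmf):
            nxt[m] += mass * (1.0 - p)
            nxt[m + 1] += mass * p
        pmf = nxt
    return pmf


def distances(ps, size=200):
    """Distances between the law of S_n and Poisson(lam), truncated at `size` terms.

    `size` should exceed len(ps) and be large enough for the Poisson tail
    beyond it to be negligible.
    """
    lam = sum(ps)
    law = (poisson_binomial_pmf(ps) + [0.0] * size)[:size]
    poi = [math.exp(-lam)]
    for m in range(1, size):
        poi.append(poi[-1] * lam / m)
    diff = [u - v for u, v in zip(law, poi)]
    cdf_diff, running = [], 0.0
    for d in diff:
        running += d
        cdf_diff.append(running)
    return {
        "chi2": sum(d * d / q for d, q in zip(diff, poi) if q > 0.0),
        "tv": 0.5 * sum(abs(d) for d in diff),
        "wasserstein": sum(abs(c) for c in cdf_diff),
        "kolmogorov": max(abs(c) for c in cdf_diff),
        "point": max(abs(d) for d in diff),
    }
```

With `distances([0.1, 0.2, 0.3])` the total variation distance is about $0.0687$ and the Kolmogorov distance about $0.0448$, while $\theta = \lambda_2/\lambda \approx 0.233$.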

In addition to the Poisson approximations to $\mathcal{L}(S_n)$ we consider in this paper, many other different
types of approximations to $\mathcal{L}(S_n)$ were proposed in the literature; these include Poisson with different
mean, compound Poisson, translated Poisson, large deviations, other perturbations of Poisson, binomial,
compound binomial, etc. They are too numerous to be listed and compared here; see, for example, Barbour
et al. [9], Roos [69, 72], Barbour and Chryssaphinou [7], Barbour and Chen [6], Röllin [66] and the
references therein.



6.1 The $\chi^2$-distance and the Kullback-Leibler divergence

Borisov and Vorozheikin [14] showed that dy^2 ~ 0^/2 under the assumption that 9 = o(A^^/^). They 
also derived in the same paper the identity (3.10) in the special case when all pj's are equal. More refined 
estimates were then given. The estimate (4.6) we obtained is more general and stronger. 

The Kullback-Leibler divergence has been widely studied in the information-theoretic literature and
many results are known. The connection between $d_{TV}$ and $d_{KL}$ for general distributions has also received
much attention since it can be used to bridge results in probability theory and in information theory;
see the survey paper Fedotov et al. [34] for more information and references. One such tool is
Pinsker's inequality $d_{TV} \le \sqrt{d_{KL}/2}$ (see [34]). Note that in the case of $S_n$, this inequality implies that
$d_{TV} \le \sqrt{d_{\chi^2}/2}$, while we have $d_{TV} \le \sqrt{d_{\chi^2}}/2$ by (2.12) and (2.19).
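The two routes from a divergence to the total variation distance can be compared on a concrete example. The Python sketch below is an illustration only: it assumes that $d_{KL}$ is the divergence of the law of $S_n$ relative to the Poisson law, that $d_{TV}$ carries the factor $1/2$, and that $d_{\chi^2}$ is weighted by the reciprocal Poisson probabilities. It returns the three quantities so that Pinsker's bound $\sqrt{d_{KL}/2}$ and the bound $\sqrt{d_{\chi^2}}/2$ can be evaluated side by side. The function names are hypothetical.

```python
import math


def poisson_binomial_pmf(ps):
    """Exact law of S_n = sum of independent Bernoulli(p_j), by convolution."""
    pmf = [1.0]
    for p in ps:
        nxt = [0.0] * (len(pmf) + 1)
        for m, mass in enumerate(pmf):
            nxt[m] += mass * (1.0 - p)
            nxt[m + 1] += mass * p
        pmf = nxt
    return pmf


def divergences(ps, size=200):
    """Return (d_TV, d_KL, d_chi2) between the law of S_n and Poisson(lam)."""
    lam = sum(ps)
    law = (poisson_binomial_pmf(ps) + [0.0] * size)[:size]
    poi = [math.exp(-lam)]
    for m in range(1, size):
        poi.append(poi[-1] * lam / m)
    tv = 0.5 * sum(abs(u - v) for u, v in zip(law, poi))
    # Only the support of S_n contributes to the Kullback-Leibler sum.
    kl = sum(u * math.log(u / v) for u, v in zip(law, poi) if u > 0.0)
    chi2 = sum((u - v) ** 2 / v for u, v in zip(law, poi) if v > 0.0)
    return tv, kl, chi2


def tv_upper_bounds(ps):
    """Return (Pinsker bound sqrt(d_KL / 2), chi-square bound sqrt(d_chi2) / 2)."""
    _, kl, chi2 = divergences(ps)
    return math.sqrt(kl / 2.0), math.sqrt(chi2) / 2.0
```

For $p = (0.1, 0.2, 0.3)$ the total variation distance is about $0.069$, and in this instance the bound through $d_{\chi^2}$ is the smaller of the two upper bounds.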

Kontoyiannis et al. [51] recently proved, by an information-theoretic approach, that

\[ d_{KL} \le \frac{1}{\lambda}\sum_{1\le j\le n}\frac{p_j^3}{1-p_j}. \]

The right-hand side in the above inequality is, by the Cauchy-Schwarz inequality, always larger than $\theta^2$, provided
that at least one of the $p_j$'s is nonzero, and can be considerably larger than our estimate (3.12) for
certain cases. Indeed, take for example pj = 1/ + Then

n3 



dKL ^ 



1 



P3 



n 



where the symbol "$a_n \asymp b_n$" means that $a_n$ is asymptotically of the same order as $b_n$. Our result (3.12)
yields in this case the estimate



dm. ^ 



1 



n 



6.2 The total variation distance 

We mentioned in the Introduction some results in Le Cam [54] and other refinements in the literature of the
form $d_{TV} \le c\theta$. We briefly review and compare here other results for $d_{TV}$.



First- and second-order estimates. Kerstan [49], in addition to proving that $d_{TV} \le 0.6\theta$ (a constant
later corrected to 1.05 by Barbour and Hall [8]), also proved the second-order estimate



= 3) - e 



^A^ 



1 



A, 



^ 1.3^ + 3.90^ 
A 



Similar estimates were derived later in Herrmann [39], Chen [23], Barbour and Hall [8]. The order of the
error terms is, however, not optimal for large $\lambda$; see Theorem 4.2.

Many fine estimates were obtained in the series of papers by Deheuvels, Pfeifer and their co-authors. 
In particular, Deheuvels and Pfeifer [30] proved djy ^ ^/(l ~ v^) for 6* < 1/2 and the second-order 
estimate 



E 

J>0 



P(^„ = 3) - e 



A. 



C2(A,j; 



3/2 



1 
for $\theta < 1/2$, the order of the error terms being tight. For many other estimates (including higher-order
ones), see [30, 31]. Their approach is based on a semi-group formulation, followed by applying the fine
estimates of Shorgin [80], which in turn were obtained by the complex-analytic approach of Uspensky
[83]. Following a similar approach, Witte [86] gives an upper bound of the form



dTV ^ 



27r(l- 2e2p* 



for 9 < ^e"^^*, as well as other more complicated ones. Another very different form for drv can be found 
in Weba [85], which results from combining several known estimates. 

By refining further Deheuvels and Pfeifer's approach, Roos [69, 70] deduced several precise estimates
for $d_{TV}$ and other distances. In particular, he showed that

\[ d_{TV} \le \Bigl(\frac{3}{4e} + \frac{7\sqrt{\theta}\,(3-2\sqrt{\theta})}{6(1-\sqrt{\theta})^2}\Bigr)\theta, \]

when $\theta < 1$; see [70] and the references therein. The proof of this estimate is based on a second-order
approximation; see (4.5).
Note that since $d_{TV} \le 1$, any result of the form $d_{TV} \le \varphi(\theta)\theta$ for $\theta \le \theta_1$, $\theta_1 \in (0, 1)$, also leads to an
upper bound of the form $d_{TV} \le c\theta$, where

\[ c = \sup_{0 < t \le \theta_0}\varphi(t), \]

$\theta_0 := \min\{\theta_1, \theta_2\}$, $\theta_2 \in (0, 1)$ solving the equation $t\varphi(t) = 1$.

Higher-order approximations based on Charlier expansion are studied in Herrmann [39], Barbour [3], 
Deheuvels and Pfeifer [30], Barbour et al. [9], Roos [69, 71]. 

Approximations by signed measures. Herrmann [39] proved that, when specializing to the case of $S_n$,

\[ \sum_{m\ge0}\Bigl|\mathbb{P}(S_n=m)-[z^m]\,e^{\lambda(z-1)-\lambda_2(z-1)^2/2}\Bigr| = O\Bigl(\frac{\lambda_3}{\lambda}\Bigr), \]

the rate being $\lambda^{1/2}$ away from optimal; see Theorem 5.2. Presman [64] considered the binomial case and
derived an optimal error bound. Kruopis [53] extended further Presman's analysis and derived



^ lOroAg min {l.2(T"^ + 4.2A2ct"^ 2 + + 3AX2} , 

where a := — A2 and 

w := max sup e^^^'^^-^^'l (6.1) 

which was in turn refined by Borovkov [15]. Hipp [41] discussed similar expansions for compound Poisson
distributions and attributed the idea to Kornya [52], but his bounds are weaker for large $\lambda$ in the special
case of $S_n$; see also Cekanavicius [18]. Barbour and Xia [11] proved, as a special case of their general
results, that

\[ \sum_{m\ge0}\Bigl|\mathbb{P}(S_n=m)-[z^m]\,e^{\lambda(z-1)-\lambda_2(z-1)^2/2}\Bigr| \le \frac{4\lambda_3}{\lambda^{3/2}(1-2\theta)\sqrt{1-\theta-\max_j p_j(1-p_j)/\lambda}}, \]

when $\theta < 1/2$. An extensive study was carried out by Cekanavicius in a series of papers dealing
mainly with Kolmogorov's problem of approximating convolutions by infinitely divisible distributions; 
see Cekanavicius [18, 19] and the references cited there. Approximation results using signed compound 
measures under more general settings than Sn are derived in Borovkov and Pfeifer [16], Roos [71, 72] and 
Cekanavicius [19], Barbour et al. [5]. 



Other uniform asymptotic approximations. The estimate $d_{TV} \sim \theta/\sqrt{2\pi e}$ holds whenever $\theta \to 0$. A
uniform estimate of the form

dTv = oj{e){i + o{x-^)), 

as $\lambda \to \infty$, was recently derived in [44], where

$\Phi$ being the standard normal distribution function. Other more general and more uniform approximations
were also derived in [44].
6.3 The Wasserstein distance



Deheuvels and Pfeifer [30] proved the asymptotic equivalent dw ~ X2/V2t(X, when X2/VX 00, im- 
proving earlier results in Deheuvels and Pfeifer [29]. They also obtained many other estimates, including 
the following second-order one 



for 1 6*1 ^ 1/2. Then Witte [86] gave the bound 



dw ^ 



[A]! 



25/2_)^l/2^3/2 



26 



2V27r 

for 9 < |e"2p. xia [87] showed that dw ^ Az/v^ 



log(l -2e2^'*^) 



); see also Barbour and Xia [12] for the 



estimate $d_W \le 8\lambda_2/(3\sqrt{2e\lambda})$. The strongest results, including more precise higher-order approximations,
were derived by Roos (1999, 2001), where, in particular,



dw ^ 



8(2-^) 
2^ ' 5(l-v^)2 



For other results in connection with Wasserstein metrics, see Deheuvels et al. [27], Hwang [43], 
Cekanavicius and Kruopis [20]. 



6.4 The Kolmogorov distance 

It is known, by definition and Newton's inequality (see Comtet [24, p. 270] or Pitman [60]), that
$d_K \le d_{TV} \le 2d_K$; see Daley and Vere-Jones [26], Ehm [33], Roos [70]. Thus all upper estimates for $d_{TV}$
translate directly to those for $d_K$ and vice versa. Also many approximation results in probability theory for
sums of independent random variables apply to $S_n$. Neither type of result is listed or discussed here;
see for example Arak and Zaitsev [2].

Up to now, we have only considered non-uniform bounds for $d_K$. However, effective uniform bounds can be
easily derived based on the Fourier inversion formula



dx = sup 

m 
1 



1 

2^ 



-r^t^ e 



2ti 



3A(cost— 1) 



n (l+P,(e--l))e-^(^^*-) 



dt. 



(6.2) 



From (6.2) and (3.4), we have 



dK < — A2 

TT 



which, by the simple inequalities $|1 - e^{it}| \le |t|$ and $1 - \cos t \ge 2t^2/\pi^2$ for $t \in [-\pi, \pi]$, leads to



dK ^ — A2 

TT 



te-^-'^'/^'dt 



Cin9 



4(1 



26 



where $c_1\pi/4 \approx 0.51$. Although this bound is worse than some known ones such as $d_K \le 0.36\theta$ in
Daley and Vere-Jones [26], its derivation is very simple and self-contained, the order being also tight.
Furthermore, the leading constant $c_1\pi/4$ can be lowered, say to $0.363c_1 < 0.24$, by a more careful analysis,
but we do not pursue this further here. Note that it is known that $d_K \sim \theta/(2\sqrt{2\pi e})$, as $\theta = o(1)$; see
Deheuvels and Pfeifer [30], Hwang [43], where $1/(2\sqrt{2\pi e}) \approx 0.121$.

In a little known paper, Makabe [55] gives a systematic study of $d_K$ using standard Fourier analysis,
improving earlier results by Kolmogorov [50], Le Cam [54], Hodges and Le Cam [42]. In particular, he 
first derived a second-order estimate from which he deduced that dx ^ 3.7^^ and 

dx ^ + o {e^ + p,e) . 



For < 1/5, he also provided a one-page proof of 

dx ^ 



256 



4(1 - 2p, - 56/2) 12 - 506 



A Le Cam-type inequality of the form $d_K \le 2\lambda_2/\pi$ was given in Franken [35], which was later refined
to $d_K \le \lambda_2/2$ in Serfling [76]; see also Daley [25]. Franken [35] also proves the estimate



d 



K 



^ - (1 - e-'^'-'^) - 

TT 1 



6 



6' 



for an explicitly given c, as well as higher-order terms for dx based on Charlier expansions. His bound 
together with dx ^ I implies dx ^ 1-9^^, improving previous estimates by Le Cam and Makabe. 

Shorgin [80] derived an asymptotic expansion for the distribution of $S_n$; in particular, as a simple
application of his bounds for $|a_j|$ (see (3.9)) and $|C_k(\lambda, m)|$,



dx ^ 



1-V6' 



where 1/2 + ^/7/8 ^ 1.31. In Hipp [40], the upper bound 



dx^ 



TT 



Pi 



4A(1 



^ r 



was given, so that if ^ 1/4, then 



A bound of the form 



dx ^ 



7r6 



3(1- 



1.055 



dx ^ — nain 

TT 



V^6 



2(1 



was given in Kruopis [53], where he also derived 

,A(^ 



sup 



P(S'„ ^ m) - [z' 



1 



2 , 

< -cuA'? mm 

3 ^ 



¥A3/2(1- 5)3/2 



1 



where zu is defined in (6.1). Deheuvels and Pfeifer deduced several estimates for dx', in particular (see 
[30,31]) 



sup 

m 



P(^n < m) 



6' 



+ 



A, 



3 V(l- v^) A3/2 
Note that this can also be written as



dK-\ e-^ max <j — - A), — (A - £_) 



3 V(l- v^) 



where $\ell_\pm := \lfloor \lambda + 1/2 \pm \sqrt{\lambda + 1/4} \rfloor$.
Witte [86] then derived the estimate



2V27r(l-e2p* 

for 6* < e~^*; see also Weba [85]. Roos [69, 70] gives, among several other fine estimates. 



1 



+ 



6 



2e 5(1 - ^fe) 



Non-uniform estimates are derived in Teerapabolarn and Neammanee [82] for general dependent summands;
in the case of $S_n$ they are of the form



\3 



generally weaker than our bounds in Theorems 3.4 and 4.2, 



«(l-e-)<)m.n{l,-^} 



6.5 The point probabilities 

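As for $d_K$ above, the point metric can also be readily estimated by using the integral representation

\[ d_P \le \frac{1}{2\pi}\int_{-\pi}^{\pi} e^{\lambda(\cos t-1)}\Bigl|\prod_{1\le j\le n}\bigl(1+p_j(e^{it}-1)\bigr)e^{-\lambda(e^{it}-1)}-1\Bigr|\,dt \tag{6.3} \]

and (3.4), and we obtain for example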



dp ^ 



CiTT 



5/20 



2A(1 -0)3/2 



Classical local limit theorems for probabilities of moderate or large deviations can also be used to give
effective bounds for the point metric $d_P := \max_m |\mathbb{P}(S_n = m) - e^{-\lambda}\lambda^m/m!|$; they are not discussed here.

Results for $d_P$ were derived in Franken [35] but are too complicated to be described here. Kruopis [53]
gives the estimate



dp ^ min 



7rA(l- 0)3/2- 



A, 



as well as 



sup |P(^„ = m) - [z"']P2{z)\ ^ ^Asmin (— i 
m Sir A"^ 1 - 



4 

)2'3 



Barbour and Jensen [10] derived an asymptotic expansion; see also [3]. 
Asymptotically, as ^ 0, 

2^2^ 
see Roos [68], where he also derived a second-order estimate for $d_P$, which was later refined in [69, 70].
In particular,

3/2 



dpJU^y\ '-'f Ve 



2 \2ej 3(1 - Vey ) y/X' 

A non-uniform bound was given in Neammanee [57, 58] of the form 



F(Sn = m)-e 



^ min |m ^, A ^} A2, 



whenever A ^ 1 . 